Dear Developer,
I am a biomedical student and new to coding. Thank you for offering this great software.
I ran into multiple problems in ptm_data_import related to:
human_fasta = fasta.IndexedUniProt('../data/human_fasta/uniprot-filtered-organism__Homo+sapiens+(Human)+[9606]_.fasta')
I could not find this .fasta file anywhere, so I was unable to run [import_ubi_library_data.ipynb] and [import_sugiyama_data.ipynb].
I also ran into a KeyError in [IDR_benchmark.ipynb]:
In: disordered_data = pd.read_csv('/Users/nc1/StructuremapDEV/structuremap/data/order/disordered_regions.csv',sep=";")
disordered_data = extract_region_boundaries(disordered_data)
print(disordered_data[0:3])
disordered_data_annotation = get_disorder_annotation(df=disordered_data)
disordered_data_annotation = disordered_data_annotation.rename(columns={"structure": "disordered"})
print(disordered_data_annotation[0:3])
KeyError Traceback (most recent call last)
File ~\anaconda3\envs\structuremap\lib\site-packages\pandas\core\indexes\base.py:3621, in Index.get_loc(self, key, method, tolerance)
3620 try:
-> 3621 return self._engine.get_loc(casted_key)
3622 except KeyError as err:
File ~\anaconda3\envs\structuremap\lib\site-packages\pandas\_libs\index.pyx:136, in pandas._libs.index.IndexEngine.get_loc()
File ~\anaconda3\envs\structuremap\lib\site-packages\pandas\_libs\index.pyx:163, in pandas._libs.index.IndexEngine.get_loc()
File pandas\_libs\hashtable_class_helper.pxi:5198, in pandas._libs.hashtable.PyObjectHashTable.get_item()
File pandas\_libs\hashtable_class_helper.pxi:5206, in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 'UniProt boundaries'
The above exception was the direct cause of the following exception:
KeyError Traceback (most recent call last)
Input In [6], in <cell line: 2>()
1 disordered_data = pd.read_csv('/Users/nc1/StructuremapDEV/structuremap/data/order/disordered_regions.csv',sep=";")
----> 2 disordered_data = extract_region_boundaries(disordered_data)
3 print(disordered_data[0:3])
4 disordered_data_annotation = get_disorder_annotation(df=disordered_data)
Input In [4], in extract_region_boundaries(df)
1 def extract_region_boundaries(df: pd.DataFrame) -> pd.DataFrame:
----> 2 start = [x.split('-')[0] for x in df["UniProt boundaries"]]
3 end = [x.split('-')[1] for x in df["UniProt boundaries"]]
4 df["start"] = start
File ~\anaconda3\envs\structuremap\lib\site-packages\pandas\core\frame.py:3505, in DataFrame.__getitem__(self, key)
3503 if self.columns.nlevels > 1:
3504 return self._getitem_multilevel(key)
-> 3505 indexer = self.columns.get_loc(key)
3506 if is_integer(indexer):
3507 indexer = [indexer]
File ~\anaconda3\envs\structuremap\lib\site-packages\pandas\core\indexes\base.py:3623, in Index.get_loc(self, key, method, tolerance)
3621 return self._engine.get_loc(casted_key)
3622 except KeyError as err:
-> 3623 raise KeyError(key) from err
3624 except TypeError:
3625 # If we have a listlike key, _check_indexing_error will raise
3626 # InvalidIndexError. Otherwise we fall through and re-raise
3627 # the TypeError.
3628 self._check_indexing_error(key)
KeyError: 'UniProt boundaries'
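One way I could narrow this down is to look at which columns pandas actually parsed. The sketch below is only an illustration: it uses a small in-memory stand-in for disordered_regions.csv (the sample rows, the "UniProt ACC" column, and the comma-separated layout are my assumptions, not the real file) to show how a separator mismatch fuses the header into a single column, which would produce this same KeyError.

```python
import io

import pandas as pd

# Hypothetical stand-in for disordered_regions.csv; the real file may look different.
sample = "UniProt ACC,UniProt boundaries\nP12345,10-25\nQ67890,3-40\n"

# Reading comma-separated text with sep=";" leaves one fused column name.
wrong = pd.read_csv(io.StringIO(sample), sep=";")
print(list(wrong.columns))

# Reading with the matching separator exposes the expected column.
right = pd.read_csv(io.StringIO(sample), sep=",")
print(list(right.columns))

# Checking membership before indexing avoids the KeyError and shows the mismatch directly.
has_column_wrong = "UniProt boundaries" in wrong.columns
has_column_right = "UniProt boundaries" in right.columns
print(has_column_wrong, has_column_right)
```

Printing `disordered_data.columns` right after `pd.read_csv` in the notebook would show in the same way whether the file I have matches what `extract_region_boundaries` expects.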
I also ran into multiple errors while running data_analysis_structuremap.ipynb.
Prepare The Environment:
In: from accessory_functions import *
ModuleNotFoundError Traceback (most recent call last)
Input In [5], in <cell line: 1>()
----> 1 from accessory_functions import *
ModuleNotFoundError: No module named 'accessory_functions'
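Python raises this error when no module of that name is found on its import search path. A minimal sketch of a check, assuming accessory_functions is a local accessory_functions.py file that ships next to the notebook (I have not confirmed this, and `module_dir` is a placeholder I made up):

```python
import sys
from pathlib import Path

# Hypothetical location: the folder that is supposed to hold accessory_functions.py.
module_dir = Path.cwd()

# Put the folder on the import search path if it is not there already.
if str(module_dir) not in sys.path:
    sys.path.insert(0, str(module_dir))

# Report whether the file is actually present before attempting the import.
module_file = module_dir / "accessory_functions.py"
print(module_file, "exists:", module_file.exists())
```

If the file is not there at all, the import cannot work regardless of the search path, so the printout tells me which of the two cases I am in.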
Annotate IDRs >> IDR_benchmark notebook - Visualize pPAE cutoff
**In: pPSE_cut = px.bar(bincount_df, x='pPSE',y='count', color='cutoff',
color_discrete_map={'high exposure':'rgb(177, 63, 100)',
'low exposure':'grey'},
template="simple_white",
width=500, height=300)
pPSE_cut = pPSE_cut.update_layout(legend=dict(
title='',
yanchor="top",
y=0.99,
xanchor="right",
x=0.99
))
config={'toImageButtonOptions': {'format': 'svg', 'filename':'pPAE_cutoff'}}
pPSE_cut.show(config=config)**
KeyError Traceback (most recent call last)
Input In [30], in <cell line: 1>()
----> 1 pPSE_cut = px.bar(bincount_df, x='pPSE',y='count', color='cutoff',
2 color_discrete_map={'high exposure':'rgb(177, 63, 100)',
3 'low exposure':'grey'},
4 template="simple_white",
5 width=500, height=300)
6 pPSE_cut = pPSE_cut.update_layout(legend=dict(
7 title='',
8 yanchor="top",
(...)
11 x=0.99
12 ))
13 config={'toImageButtonOptions': {'format': 'svg', 'filename':'pPAE_cutoff'}}
File ~\anaconda3\envs\structuremap_analysis\lib\site-packages\plotly\express\_chart_types.py:350, in bar(data_frame, x, y, color, facet_row, facet_col, facet_col_wrap, facet_row_spacing, facet_col_spacing, hover_name, hover_data, custom_data, text, base, error_x, error_x_minus, error_y, error_y_minus, animation_frame, animation_group, category_orders, labels, color_discrete_sequence, color_discrete_map, color_continuous_scale, range_color, color_continuous_midpoint, opacity, orientation, barmode, log_x, log_y, range_x, range_y, title, template, width, height)
306 def bar(
307 data_frame=None,
308 x=None,
(...)
344 height=None,
345 ):
346 """
347 In a bar plot, each row of `data_frame` is represented as a rectangular
348 mark.
349 """
--> 350 return make_figure(
351 args=locals(),
352 constructor=go.Bar,
353 trace_patch=dict(textposition="auto"),
354 layout_patch=dict(barmode=barmode),
355 )
File ~\anaconda3\envs\structuremap_analysis\lib\site-packages\plotly\express\_core.py:1854, in make_figure(args, constructor, trace_patch, layout_patch)
1852 prefix = get_label(args, args["facet_row"]) + "="
1853 row_labels = [prefix + str(s) for s in sorted_group_values[m.grouper]]
-> 1854 for val in sorted_group_values[m.grouper]:
1855 if val not in m.val_map:
1856 m.val_map[val] = m.sequence[len(m.val_map) % len(m.sequence)]
KeyError: 'cutoff'
Find short unstructured regions within large folded domains
**In: proteins_with_pattern = alphafold_accessibility_smooth_pattern_ext[alphafold_accessibility_smooth_pattern_ext.flexible_pattern==1].protein_id.unique()
textfile = open("data/short_idrs/proteins_with_pattern.txt", "w")
for element in proteins_with_pattern:
textfile.write(element + "\n")
all_proteins = alphafold_accessibility_smooth_pattern_ext.protein_id.unique()
textfile = open("data/short_idrs/all_proteins.txt", "w")
for element in all_proteins:
textfile.write(element + "\n")
textfile.close()**
FileNotFoundError Traceback (most recent call last)
Input In [34], in <cell line: 3>()
1 proteins_with_pattern = alphafold_accessibility_smooth_pattern_ext[alphafold_accessibility_smooth_pattern_ext.flexible_pattern==1].protein_id.unique()
----> 3 textfile = open("data/short_idrs/proteins_with_pattern.txt", "w")
4 for element in proteins_with_pattern:
5 textfile.write(element + "\n")
FileNotFoundError: [Errno 2] No such file or directory: 'data/short_idrs/proteins_with_pattern.txt'
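As far as I understand, `open(..., "w")` creates the file but not the folders above it, so this error appears when data/short_idrs does not exist yet. A minimal sketch of creating the folder first, using a temporary directory and made-up protein IDs (the `write_lines` helper is my own illustration, not part of structuremap):

```python
import tempfile
from pathlib import Path


def write_lines(path, items):
    """Write one item per line, creating missing parent folders first."""
    path = Path(path)
    # parents=True builds every missing folder; exist_ok=True tolerates reruns.
    path.parent.mkdir(parents=True, exist_ok=True)
    # The with-block closes the file even if a write fails.
    with path.open("w") as handle:
        for item in items:
            handle.write(f"{item}\n")
    return path


# Hypothetical protein IDs written below a throwaway base folder.
base = Path(tempfile.mkdtemp())
out = write_lines(
    base / "data" / "short_idrs" / "proteins_with_pattern.txt",
    ["P12345", "Q67890"],
)
print(out.read_text())
```

In the notebook, the same effect could be had by creating data/short_idrs once before the cell runs. The with-block also closes the first file, which the original cell leaves open.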
David enrichment analysis of GO MF
In: enrichment_david = pd.read_csv('data/short_idrs/pattern_enrichment.txt', sep='\t')
FileNotFoundError Traceback (most recent call last)
Input In [37], in <cell line: 1>()
----> 1 enrichment_david = pd.read_csv('data/short_idrs/pattern_enrichment.txt', sep='\t')
File ~\anaconda3\envs\structuremap_analysis\lib\site-packages\pandas\util\_decorators.py:311, in deprecate_nonkeyword_arguments.<locals>.decorate.<locals>.wrapper(*args, **kwargs)
305 if len(args) > num_allow_args:
306 warnings.warn(
307 msg.format(arguments=arguments),
308 FutureWarning,
309 stacklevel=stacklevel,
310 )
--> 311 return func(*args, **kwargs)
File ~\anaconda3\envs\structuremap_analysis\lib\site-packages\pandas\io\parsers\readers.py:680, in read_csv(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, cache_dates, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, doublequote, escapechar, comment, encoding, encoding_errors, dialect, error_bad_lines, warn_bad_lines, on_bad_lines, delim_whitespace, low_memory, memory_map, float_precision, storage_options)
665 kwds_defaults = _refine_defaults_read(
666 dialect,
667 delimiter,
(...)
676 defaults={"delimiter": ","},
677 )
678 kwds.update(kwds_defaults)
--> 680 return _read(filepath_or_buffer, kwds)
File ~\anaconda3\envs\structuremap_analysis\lib\site-packages\pandas\io\parsers\readers.py:575, in _read(filepath_or_buffer, kwds)
572 _validate_names(kwds.get("names", None))
574 # Create the parser.
--> 575 parser = TextFileReader(filepath_or_buffer, **kwds)
577 if chunksize or iterator:
578 return parser
File ~\anaconda3\envs\structuremap_analysis\lib\site-packages\pandas\io\parsers\readers.py:933, in TextFileReader.__init__(self, f, engine, **kwds)
930 self.options["has_index_names"] = kwds["has_index_names"]
932 self.handles: IOHandles | None = None
--> 933 self._engine = self._make_engine(f, self.engine)
File ~\anaconda3\envs\structuremap_analysis\lib\site-packages\pandas\io\parsers\readers.py:1217, in TextFileReader._make_engine(self, f, engine)
1213 mode = "rb"
1214 # error: No overload variant of "get_handle" matches argument types
1215 # "Union[str, PathLike[str], ReadCsvBuffer[bytes], ReadCsvBuffer[str]]"
1216 # , "str", "bool", "Any", "Any", "Any", "Any", "Any"
-> 1217 self.handles = get_handle( # type: ignore[call-overload]
1218 f,
1219 mode,
1220 encoding=self.options.get("encoding", None),
1221 compression=self.options.get("compression", None),
1222 memory_map=self.options.get("memory_map", False),
1223 is_text=is_text,
1224 errors=self.options.get("encoding_errors", "strict"),
1225 storage_options=self.options.get("storage_options", None),
1226 )
1227 assert self.handles is not None
1228 f = self.handles.handle
File ~\anaconda3\envs\structuremap_analysis\lib\site-packages\pandas\io\common.py:789, in get_handle(path_or_buf, mode, encoding, compression, memory_map, is_text, errors, storage_options)
784 elif isinstance(handle, str):
785 # Check whether the filename is to be opened in binary mode.
786 # Binary mode does not support 'encoding' and 'newline'.
787 if ioargs.encoding and "b" not in ioargs.mode:
788 # Encoding
--> 789 handle = open(
790 handle,
791 ioargs.mode,
792 encoding=ioargs.encoding,
793 errors=errors,
794 newline="",
795 )
796 else:
797 # Binary mode
798 handle = open(handle, ioargs.mode)
FileNotFoundError: [Errno 2] No such file or directory: 'data/short_idrs/pattern_enrichment.txt'
After that, the following cells fail with one NameError after another.