Getting "NameError: name 'partition_md' is not defined" when running #118

Open
afxjzs opened this issue Aug 2, 2023 · 0 comments

afxjzs commented Aug 2, 2023

The following code from the 03. Retrieval notebook in the LLM course fails:

from langchain.document_loaders import DirectoryLoader

def find_md_files(directory):
    "Find all markdown files in a directory and return a LangChain Document"
    dl = DirectoryLoader(directory, "**/*.md")
    return dl.load()

documents = find_md_files('../docs_sample/')
len(documents)

It throws the following error:
NameError: name 'partition_md' is not defined

  • Confirmed with Python 3.11.4 in both VS Code and JupyterLab, in two different environments.
  • Running pip install ... 'langchain[all]' ... didn't fix it.
  • Importing with from langchain.loaders import DirectoryLoader instead (a suggestion from Stack Overflow) didn't work either.
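Since reinstalling did not help, it may be worth confirming what is actually installed in the kernel's environment. The sketch below is only a diagnostic aid using the standard library. The package list is an assumption: `langchain` and `unstructured` appear in the traceback, while `markdown` is a guess at an optional dependency that Markdown parsing might need, which this issue does not confirm.

```python
from importlib import metadata


def describe_packages(names):
    """Map each distribution name to its installed version, or None if absent."""
    report = {}
    for name in names:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in the current environment.
            report[name] = None
    return report


if __name__ == "__main__":
    # "markdown" is an assumed candidate, not something the issue confirms.
    candidates = ["langchain", "unstructured", "markdown"]
    for name, version in describe_packages(candidates).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Running this in the same notebook kernel that raises the error shows whether the kernel sees the same packages as the shell where pip was run.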

Full Error:

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[8], line 8
      5     dl = DirectoryLoader(directory, "**/*.md")
      6     return dl.load()
----> 8 documents = find_md_files('../docs_sample/')
      9 len(documents)

Cell In[8], line 6, in find_md_files(directory)
      4 "Find all markdown files in a directory and return a LangChain Document"
      5 dl = DirectoryLoader(directory, "**/*.md")
----> 6 return dl.load()

File ~/.pyenv/versions/3.11.4/envs/jupyter_env/lib/python3.11/site-packages/langchain/document_loaders/directory.py:133, in DirectoryLoader.load(self)
    131 else:
    132     for i in items:
--> 133         self.load_file(i, p, docs, pbar)
    135 if pbar:
    136     pbar.close()

File ~/.pyenv/versions/3.11.4/envs/jupyter_env/lib/python3.11/site-packages/langchain/document_loaders/directory.py:94, in DirectoryLoader.load_file(self, item, path, docs, pbar)
     92         logger.warning(f"Error loading file {str(item)}: {e}")
     93     else:
---> 94         raise e
     95 finally:
     96     if pbar:

File ~/.pyenv/versions/3.11.4/envs/jupyter_env/lib/python3.11/site-packages/langchain/document_loaders/directory.py:88, in DirectoryLoader.load_file(self, item, path, docs, pbar)
     86 try:
     87     logger.debug(f"Processing file: {str(item)}")
---> 88     sub_docs = self.loader_cls(str(item), **self.loader_kwargs).load()
     89     docs.extend(sub_docs)
     90 except Exception as e:

File ~/.pyenv/versions/3.11.4/envs/jupyter_env/lib/python3.11/site-packages/langchain/document_loaders/unstructured.py:86, in UnstructuredBaseLoader.load(self)
     84 def load(self) -> List[Document]:
     85     """Load file."""
---> 86     elements = self._get_elements()
     87     if self.mode == "elements":
     88         docs: List[Document] = list()

File ~/.pyenv/versions/3.11.4/envs/jupyter_env/lib/python3.11/site-packages/langchain/document_loaders/unstructured.py:171, in UnstructuredFileLoader._get_elements(self)
    168 def _get_elements(self) -> List:
    169     from unstructured.partition.auto import partition
--> 171     return partition(filename=self.file_path, **self.unstructured_kwargs)

File ~/.pyenv/versions/3.11.4/envs/jupyter_env/lib/python3.11/site-packages/unstructured/partition/auto.py:214, in partition(filename, content_type, file, file_filename, url, include_page_breaks, strategy, encoding, paragraph_grouper, headers, skip_infer_table_types, ssl_verify, ocr_languages, pdf_infer_table_structure, xml_keep_tags, data_source_metadata, **kwargs)
    207     elements = partition_rst(
    208         filename=filename,
    209         file=file,
    210         include_page_breaks=include_page_breaks,
    211         **kwargs,
    212     )
    213 elif filetype == FileType.MD:
--> 214     elements = partition_md(
    215         filename=filename,
    216         file=file,
    217         include_page_breaks=include_page_breaks,
    218         **kwargs,
    219     )
    220 elif filetype == FileType.PDF:
    221     elements = partition_pdf(
    222         filename=filename,  # type: ignore
    223         file=file,  # type: ignore
   (...)
    229         **kwargs,
    230     )

NameError: name 'partition_md' is not defined
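The traceback shows the failure occurring inside the unstructured-based partitioning step, not in the file discovery itself. As a stopgap while the root cause is unknown, the Markdown files can be read as plain text with the standard library alone. This is a minimal sketch, not the notebook's method: `find_md_files_plain` is a hypothetical helper, and it returns `(path, text)` tuples, not LangChain Document objects, so later notebook cells would need adapting.

```python
from pathlib import Path


def find_md_files_plain(directory):
    """Return (path, text) pairs for every .md file under `directory`.

    Hypothetical stand-in for the notebook's find_md_files; it reads raw
    text and does not parse Markdown structure.
    """
    root = Path(directory)
    pairs = []
    # rglob("*.md") walks subdirectories, matching the "**/*.md" pattern.
    for path in sorted(root.rglob("*.md")):
        if path.is_file():
            pairs.append((str(path), path.read_text(encoding="utf-8")))
    return pairs
```

Calling `find_md_files_plain('../docs_sample/')` and taking `len(...)` of the result mirrors the last two lines of the failing cell, which at least confirms that the glob finds the expected files.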