Hello, I'd like to ask how the data sources used here were produced #9

MOONSky1996 · 2020-01-06T07:41:13Z

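Hello, thank you very much for sharing this. I would like to use your published code for name disambiguation. I have already gotten the code running and have read the paper, but I am still not sure how the data sources it uses were generated. For example, how were the two files pubs_raw.json and name_to_pubs_test_100.json generated? I would be very grateful for any help!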

LynnnnnnnnnnN · 2020-04-07T09:31:35Z

Hi, may I ask whether you ran into this problem when running preprocessing.py:

Traceback (most recent call last):
  File "scripts/preprocessing.py", line 112, in <module>
    dump_author_features_to_cache()
  File "scripts/preprocessing.py", line 54, in dump_author_features_to_cache
    lc.set(pid_order, author_features)
  File "/mnt/d/Study/WSD/others/disambiguation-master/utils/cache.py", line 39, in set
    txn.put(key.encode("utf-8"), data_utils.serialize_embedding(vector))
lmdb.CorruptedError: mdb_put: MDB_CORRUPTED: Located page was wrong type
I'm still at a bit of a loss about what to do.

MOONSky1996 · 2020-04-08T01:27:27Z

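I haven't run into this error. This mdb problem is very likely caused by not running on Linux. I don't know whether you are running the code on Linux; if not, try running it in a Linux virtual machine first.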
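The path in the traceback starts with /mnt/d/, which is where WSL typically mounts the Windows D: drive. LMDB relies on memory-mapped files, and one plausible cause (an assumption, not confirmed in this thread) is that such mounted Windows drives do not behave like a native Linux filesystem for that purpose. A minimal sketch of a check for this situation follows; the helper names are hypothetical and are not part of the repository:

```python
import platform
import re
from typing import Optional

# Matches paths on a Windows drive mounted into WSL, e.g. /mnt/d/...
WSL_DRIVE_MOUNT = re.compile(r"^/mnt/[a-zA-Z](/|$)")


def running_under_wsl(release: Optional[str] = None) -> bool:
    """Heuristic: WSL kernels report 'microsoft' in their release string."""
    if release is None:
        release = platform.uname().release
    return "microsoft" in release.lower()


def on_windows_drive_mount(path: str) -> bool:
    """True if the path looks like /mnt/<drive letter>/..."""
    return WSL_DRIVE_MOUNT.match(path) is not None


def cache_path_looks_risky(path: str, release: Optional[str] = None) -> bool:
    """True when running under WSL and the path sits on a mounted Windows drive."""
    return running_under_wsl(release) and on_windows_drive_mount(path)


if __name__ == "__main__":
    # Path taken from the traceback above.
    cache_file = "/mnt/d/Study/WSD/others/disambiguation-master/utils/cache.py"
    if cache_path_looks_risky(cache_file):
        print("Consider moving the project under the Linux home directory.")
```

If the check fires, moving the project (and with it the LMDB directory) to a path inside the Linux filesystem, or using a full Linux virtual machine as suggested above, may be worth trying.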

LynnnnnnnnnnN · 2020-04-08T02:16:37Z

Thank you so much for the reply!
I ran it on the built-in Ubuntu on Windows 10, so I'll try running it in a virtual machine instead.

sanlunainiu · 2023-10-12T01:38:02Z

Hello, has your problem been solved?

sanlunainiu · 2023-10-12T01:38:32Z

Hello, has your problem been solved?
