-
Really it's find and/or walk that would need to be the generator. I would be supportive of ifind and iglob methods. It's doable, but would need some careful refactoring, because various implementations override some of the methods. Is the size of the glob output a bottleneck for you? Or is it the upfront time spent listing files? I would be surprised if either is significant compared to kerchunk-scanning each file.
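To make the idea concrete, here is a minimal sketch of what a lazy glob layered on top of walk could look like. The name `iglob` and its signature are hypothetical and not an existing fsspec API; the sketch assumes only that `fs.walk(root)` yields `(dirpath, dirnames, filenames)` tuples in the style of `os.walk`, and it matches on the file's basename rather than supporting full glob syntax such as `**`. `ToyFS` is a stand-in so the example runs without any real filesystem.

```python
import fnmatch
import posixpath
from typing import Iterator


def iglob(fs, root: str, pattern: str) -> Iterator[str]:
    """Hypothetical helper (not part of fsspec): lazily yield file paths
    under `root` whose basename matches the shell-style `pattern`."""
    # Assumption: fs.walk(root) yields (dirpath, dirnames, filenames)
    # tuples, in the style of os.walk.
    for dirpath, _dirnames, filenames in fs.walk(root):
        for name in filenames:
            if fnmatch.fnmatchcase(name, pattern):
                yield posixpath.join(dirpath, name)


class ToyFS:
    """Stand-in filesystem for the demo: maps a directory to (subdirs, files)."""

    def __init__(self, tree):
        self.tree = tree

    def walk(self, root):
        dirnames, filenames = self.tree[root]
        yield root, dirnames, filenames
        for d in dirnames:
            yield from self.walk(posixpath.join(root, d))


fs = ToyFS({
    "/data": (["2020"], ["a.nc", "notes.txt"]),
    "/data/2020": ([], ["b.nc"]),
})
matches = iglob(fs, "/data", "*.nc")
first = next(matches)  # only the top-level directory has been walked so far
```

Each directory listing is still held in memory while it is being filtered, so this only avoids building the full list of matches, not the per-directory listings.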
-
I don't really have issues with lists right now. I'm just thinking that if we want a process to kerchunk a very large number of files, it would be nice to pipe files directly to kerchunk without the memory overhead of a potentially huge list. So it's not a matter of the time spent generating the list, but rather of being able to pipeline things. Does that make sense?
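As a rough illustration of the pipelining I have in mind, the sketch below consumes an iterator of paths in fixed-size batches, so that only one batch is held in memory at a time. The helper `batched` and the placeholder `scan_one` are hypothetical names; `scan_one` stands in for whatever per-file scanning step is used and is not a kerchunk function.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def batched(paths: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` paths from any iterable."""
    it = iter(paths)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def scan_one(path: str) -> str:
    """Placeholder for the per-file scanning step (hypothetical)."""
    return f"scanned:{path}"


def lazy_paths() -> Iterator[str]:
    """Stand-in for a generator-returning glob."""
    for i in range(5):
        yield f"file{i}.nc"


results = []
for batch in batched(lazy_paths(), size=2):
    # Only the current batch of paths exists as a list at this point.
    results.extend(scan_one(p) for p in batch)
```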
-
Hi,
How easy would it be to get a generator back from glob instead of a list? Perhaps an iglob method, or an argument to the current fsspec glob implementation?
In the context of pipelines (I'm thinking kerchunk) that could be an interesting feature.
Thanks!