Wondering if some interfaces in the processing pipeline are not too lean #683
dziedrius
started this conversation in
2. Feature requests
While looking at the processing pipeline, I was wondering whether the current interfaces are too restrictive in what they can return.
For example, `IOcrEngine` returns only a string. OCR is usually a paid service, so if the one I'm using provides more data (such as the positions of recognized text, which could be used later to censor pieces of text, paragraphs, etc.), I would rather not discard it, to avoid calling the service a second time.
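To make the idea concrete, here is a rough sketch, in Python for brevity, of what a richer OCR result could look like. `TextRegion`, `OcrResult` and `regions_to_censor` are hypothetical names I made up for illustration, not existing types in the library, and the sample data is invented.

```python
from dataclasses import dataclass, field


@dataclass
class TextRegion:
    """One recognized piece of text and where it sits on the page (hypothetical)."""
    text: str
    x: int
    y: int
    width: int
    height: int


@dataclass
class OcrResult:
    """Hypothetical richer return type: the plain text plus extra data from the service."""
    text: str
    regions: list[TextRegion] = field(default_factory=list)


def regions_to_censor(result: OcrResult, banned: set[str]) -> list[TextRegion]:
    """Pick the regions to black out, reusing positions from the single OCR call."""
    return [r for r in result.regions if r.text.lower() in banned]


# Stand-in for a response from a paid OCR service (made-up data).
result = OcrResult(
    text="Invoice for John Smith",
    regions=[
        TextRegion("Invoice", 10, 10, 80, 20),
        TextRegion("for", 95, 10, 30, 20),
        TextRegion("John", 130, 10, 50, 20),
        TextRegion("Smith", 185, 10, 60, 20),
    ],
)

# Callers that only need a string still read result.text;
# a later step can censor names without calling the service again.
to_hide = regions_to_censor(result, {"john", "smith"})
```

Callers that only care about the text lose nothing, while the extra data stays available for later steps.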
Similarly for `IContentDecoder`: if I have parsed the file and ended up with some data that should not go into the file content (for example image EXIF metadata or office file metadata) but perhaps into tags or the payload, it would be nice not to have to process the file again.
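A similar sketch for the decoder case, again in Python and again with hypothetical names (`DecodedFile`, `metadata_to_tags`) and invented sample values. It assumes tags are a mapping from a key to a list of values, which may not match the actual tag model.

```python
from dataclasses import dataclass, field


@dataclass
class DecodedFile:
    """Hypothetical decoder output: text for indexing plus side data found while parsing."""
    content: str
    metadata: dict[str, str] = field(default_factory=dict)


def metadata_to_tags(decoded: DecodedFile, keep: set[str]) -> dict[str, list[str]]:
    """Turn selected metadata entries into tags instead of mixing them into the content."""
    return {key: [value] for key, value in decoded.metadata.items() if key in keep}


# Made-up result of parsing an image once: extracted text plus EXIF fields.
decoded = DecodedFile(
    content="Team photo, summer offsite",
    metadata={
        "exif.camera": "ExampleCam X1",
        "exif.taken": "2024-06-01",
        "exif.software": "1.0",
    },
)

# The file is parsed once; the metadata is routed to tags afterwards.
tags = metadata_to_tags(decoded, keep={"exif.camera", "exif.taken"})
```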
Maybe I'm overthinking this; I would be interested in what others think :)