nebuxadnezzar/xml-tag-extractor

xml-tag-extractor

Extracts the XML documents found between a set of tags and transforms each one into a single-line document.

Command line


Usage: xte [OPTIONS] XML-file-path
OPTIONS:
  -boost
    	boost processing, helps big file processing
  -ca
  -convert.attributes
    	convert attributes to elements
  -h
  -help
    	show help
  -ol
  -one.liner
    	transform XML document into one-liner

  -rt string
  -root.tag string
    	add provided root tags to each document to make correct XML document

  -xp string
  -xml.paths string
    	CSV list of paths to tag(s) to extract, i.e. root:greeting OR root:greetings,root:story

Examples

Given this file (my.xml):

<root>
    <greetings>hello</greetings>
    <greetings>good bye <times>3</times>
    </greetings>
    <greetings id="123"/>
    <smiles>wide</smiles>
</root>

To extract the greetings tags together with their content, use the command:

xte -ol -xp root:greetings my.xml

The output is a set of records, one document per line:

<greetings>hello</greetings>
<greetings>good bye <times>3</times></greetings>
<greetings id="123"/>
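
The extraction step can be illustrated with a minimal Python sketch. This is not xte's implementation; the function name `extract_one_liners` and the whitespace handling are assumptions made for illustration. It streams the sample file, tracks the current colon-separated tag path, and emits every element at the requested path as a single line.

```python
import io
import re
import xml.etree.ElementTree as ET

SAMPLE = """<root>
    <greetings>hello</greetings>
    <greetings>good bye <times>3</times>
    </greetings>
    <greetings id="123"/>
    <smiles>wide</smiles>
</root>
"""


def extract_one_liners(source, path):
    """Yield each element found at a colon-separated tag path as one line.

    Hypothetical helper: it mimics the idea of `-ol -xp`, not xte's code.
    """
    target = path.split(":")
    stack = []  # tags of the currently open elements, outermost first
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            stack.append(elem.tag)
            continue
        if stack == target:
            elem.tail = None  # drop the text that follows the closing tag
            xml = ET.tostring(elem, encoding="unicode")
            # Remove whitespace that only separates tags, then fold any
            # remaining runs of whitespace (including newlines) into one space.
            xml = re.sub(r">\s+<", "><", xml)
            line = " ".join(xml.split())
            elem.clear()  # the element is serialized, so its content can go
            yield line
        stack.pop()


if __name__ == "__main__":
    for line in extract_one_liners(io.BytesIO(SAMPLE.encode()), "root:greetings"):
        print(line)
```

Note that this sketch also collapses whitespace inside text and attribute values, and that ElementTree may serialize empty elements slightly differently from the output shown above.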

To extract the times tags together with their content, use the command:

xte -ol -xp root:greetings:times my.xml

If no tag-path argument is provided, xte prints every tag path found in the XML file along with its count. This is useful when you have a huge file and don't know its XML structure.

xte my.xml
root		1
root:greetings		3
root:greetings:times		1
root:smiles		1
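
For readers who want to see how such a path listing can be produced, here is a minimal Python sketch. It is an illustration under assumptions, not xte's own code; `count_tag_paths` is a hypothetical name. It keeps a stack of open tags while streaming the file and counts each colon-separated path when an element opens.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter

SAMPLE = """<root>
    <greetings>hello</greetings>
    <greetings>good bye <times>3</times>
    </greetings>
    <greetings id="123"/>
    <smiles>wide</smiles>
</root>
"""


def count_tag_paths(source):
    """Count how often each colon-separated tag path occurs in an XML stream."""
    counts = Counter()
    stack = []  # tags of the currently open elements, outermost first
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            stack.append(elem.tag)
            counts[":".join(stack)] += 1
        else:
            stack.pop()
            elem.clear()  # discard text, attributes and children once closed
    return counts


if __name__ == "__main__":
    counts = count_tag_paths(io.BytesIO(SAMPLE.encode()))
    for path, n in sorted(counts.items()):
        print(f"{path}\t\t{n}")
```

Because the sketch streams events instead of loading the whole tree, the same approach should suit large files, although the cleared (empty) elements stay attached to their parents.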

Notes

xte doesn't parse out attributes and convert them to elements. This will be the next feature.
