I don't want to waste anybody's time here, but I will upload a sample of my XML file and the code I am trying to use to parse it. If anybody can tell me what I'm doing wrong, I would be insanely grateful.
XML sample:
<LineItem count="1">
  <PurchaseOrderUnit>
    <PurchaseOrderUnitId identType="VIN" ident="021360"/>
    <PurchaseOrderUnitQty UOMBasis="pack">10</PurchaseOrderUnitQty>
    <BuyersCost UOMBasis="pack">12.1200</BuyersCost>
    <Taxes taxable="No"/>
  </PurchaseOrderUnit>
  <RetailUnitPricing>
    <RetailUnitId identType="GTIN" ident="00005200003940">SINGLE</RetailUnitId>
    <RetailUnitQty identType="GTIN" ident="00005200003940" UOMBasis="each">1.000</RetailUnitQty>
    <RetailPrice>1.69</RetailPrice>
  </RetailUnitPricing>
</LineItem>
I am trying to extract the ident attribute of PurchaseOrderUnitId and the value of PurchaseOrderUnitQty. They eventually need to be formatted like this: 6-digit ID number/+Qty. But right now I'm just trying to pull the numbers out of the XML file
with this code:
import xml.etree.cElementTree as ET
tree = ET.ElementTree(file="c:\\users\\design\\desktop\\scripttest\\newsample.xml")
root = tree.getroot()
for PurchaseOrderUnit in root.findall('PurchaseOrderUnit'):
    qty = PurchaseOrderUnit.findall('PurchaseOrderUnitQty').text
    id = PurchaseOrderUnit.get('PurchaseOrderUnitId')
    print id, qty
When I run it from the command prompt, it doesn't output anything: no errors, nothing.
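For what it's worth, one likely reason for the silent run is that findall('PurchaseOrderUnit') only looks at direct children of the root, so if LineItem sits below the real root the loop body never executes (an XML namespace on the real file could cause the same symptom). Below is a minimal Python 3 sketch of the ElementTree approach. The Order wrapper element and the extract_units helper are assumptions made for illustration, not part of the original file or code.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample, wrapped in a hypothetical <Order> root to
# mimic a file where <LineItem> is not the top-level element.
SAMPLE = """<Order>
  <LineItem count="1">
    <PurchaseOrderUnit>
      <PurchaseOrderUnitId identType="VIN" ident="021360"/>
      <PurchaseOrderUnitQty UOMBasis="pack">10</PurchaseOrderUnitQty>
    </PurchaseOrderUnit>
  </LineItem>
</Order>"""


def extract_units(root):
    """Return (ident, qty) string pairs for every PurchaseOrderUnit under root."""
    pairs = []
    # iter() walks the whole subtree; findall('tag') checks direct children only.
    for unit in root.iter("PurchaseOrderUnit"):
        unit_id = unit.find("PurchaseOrderUnitId")
        qty = unit.find("PurchaseOrderUnitQty")
        if unit_id is None or qty is None:
            continue  # skip units missing either child
        # ident is an attribute, so use get(); the quantity is element text.
        pairs.append((unit_id.get("ident"), qty.text))
    return pairs


root = ET.fromstring(SAMPLE)
print(root.findall("PurchaseOrderUnit"))  # [] because it is not a direct child
for ident, qty in extract_units(root):
    print(ident, qty)
```

Note that find() returns a single element (or None), whereas findall() returns a list, which has no .text attribute.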
10 Responses
OK, firstly, a good practice for a beginner with XML is to put the XML file in the same folder as the .py file, to avoid path errors when moving your project to other PCs. I would also suggest using minidom to parse your XML,
so use this:
file = minidom.parse("sample.xml")
Then use a for statement for each attribute you want to get. The values should be printed using parentheses, i.e. print(id) and print(qty), if I'm not wrong.
So your final .py file could be something like this:
from xml.dom import minidom
import itertools
file = minidom.parse("sample.xml")
id = file.getElementsByTagName("PurchaseOrderUnitId")
qty = file.getElementsByTagName("PurchaseOrderUnitQty")
for i in id:
    pass
for j in qty:
    pass
id = (i.attributes["ident"].value)
qty = j.firstChild.nodeValue
print (id)
print (qty)
Hacked by Mr_Nakup3nda
Hope this helps with your Python XML parsing.
OMG, thank you for responding, Mr! Funny thing is, I did have that sample in the same folder as the script file; I'm not sure why I was using the full path. I'm trying your code and it keeps throwing an error: AttributeError: Element instance has no attribute. Any ideas?
Before you proceed, make sure to rename your XML file to "sample.xml", or change the following line to use your XML file's name:
file = minidom.parse("sample.xml")
Then make sure it's in the same folder as your Python file.
I fixed that and got past the error, and it works, but it doesn't iterate through and pull all of the UnitId and Qty values. It just pulls one.
Again, thank you.
Could you be more explicit about what you are trying to do? For example, what output do you want?
In case you want the output as you described above ("they eventually need formatted like this 6 digit id number/+Qty"), the code is this:
from xml.dom import minidom
import itertools
file = minidom.parse("sample.xml")
id = file.getElementsByTagName("PurchaseOrderUnitId")
qty = file.getElementsByTagName("PurchaseOrderUnitQty")
for i in id:
    pass
for j in qty:
    pass
id = (i.attributes["ident"].value)
qty = j.firstChild.nodeValue
print (id)
print (qty)
print (id,"/+",qty)
Yeah, as far as formatting goes, that is the output I needed. The XML file that I will be working with will have multiple unit ID numbers and quantities that I need to extract. Right now that code only extracts one number and its quantity, if that makes sense.
Glad that I could help you. Try adding more units and quantities to the XML file; if you find any problem just let me know, but try to experiment by yourself first. Take care of your code: it's good practice to comment it.
The sample file that I am actually using has about 7 of each; I just didn't put the entire file on here. One thing I noticed is that if I do:
for i in id:
    print (i.attributes["ident"].value)
it iterates through all the items in the sample. It's when I add the second for statement and try to print both values together that it doesn't print every iteration. Please excuse my ignorance; I have taken only minor courses in Python and programming in general.
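A small illustration of why the earlier version, with its two "for ... pass" loops, printed only one pair: once a for loop finishes, its loop variable still refers to the last element only, so any work placed after the loop runs a single time. This is just a sketch using plain lists; the values are taken from this thread and stand in for the minidom node lists.

```python
# Stand-ins for the node lists returned by getElementsByTagName.
ids = ["021360", "023408", "064014"]
qtys = ["10", "2", "10"]

for i in ids:
    pass  # the loop body does nothing, so nothing is printed here

for j in qtys:
    pass

# Both loops are finished: i and j now hold only the LAST elements,
# and these statements execute exactly once.
print(i, j)  # 064014 10
```

To print every pair, the print has to live inside a loop that visits each ID together with its matching quantity.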
OK, this code is almost there:
from xml.dom import minidom
from itertools import imap
xmldoc = minidom.parse("newsample.xml")
itemlist = xmldoc.getElementsByTagName("PurchaseOrderUnitId")
quantity = xmldoc.getElementsByTagName("PurchaseOrderUnitQty")
for i in itemlist:
    for s in quantity:
        print(i.attributes["ident"].value),
        print s.firstChild.nodeValue
which outputs:
021360 10
021360 2
021360 10
021360 5
021360 10
021360 15
021360 6
023408 10
023408 2
023408 10
023408 5
023408 10
023408 15
023408 6
064014 10
064014 2
064014 10
064014 5
064014 10
064014 15
064014 6
The problem is that it repeats each item number once for every quantity and just recycles the quantities, so they don't match the items they belong to. Essentially, what I need is:
021360 10
023408 2
064014 10
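The nested loops print every ID against every quantity, which is why each item number repeats. One way to keep them matched, sketched here in Python 3, is to loop over each PurchaseOrderUnit element and read its own ID and quantity children. The Order wrapper, the trimmed three-item sample, and the helper names paired_units and format_unit are assumptions for illustration; the values come from the desired output above.

```python
from xml.dom import minidom

# Hypothetical three-item document built from the values in this thread.
SAMPLE = """<Order>
  <LineItem count="1"><PurchaseOrderUnit>
    <PurchaseOrderUnitId identType="VIN" ident="021360"/>
    <PurchaseOrderUnitQty UOMBasis="pack">10</PurchaseOrderUnitQty>
  </PurchaseOrderUnit></LineItem>
  <LineItem count="2"><PurchaseOrderUnit>
    <PurchaseOrderUnitId identType="VIN" ident="023408"/>
    <PurchaseOrderUnitQty UOMBasis="pack">2</PurchaseOrderUnitQty>
  </PurchaseOrderUnit></LineItem>
  <LineItem count="3"><PurchaseOrderUnit>
    <PurchaseOrderUnitId identType="VIN" ident="064014"/>
    <PurchaseOrderUnitQty UOMBasis="pack">10</PurchaseOrderUnitQty>
  </PurchaseOrderUnit></LineItem>
</Order>"""


def paired_units(doc):
    """Return (ident, qty) pairs, one per PurchaseOrderUnit element."""
    pairs = []
    for unit in doc.getElementsByTagName("PurchaseOrderUnit"):
        # Search inside this unit only, so the ID and quantity stay matched.
        id_node = unit.getElementsByTagName("PurchaseOrderUnitId")[0]
        qty_node = unit.getElementsByTagName("PurchaseOrderUnitQty")[0]
        pairs.append((id_node.getAttribute("ident"),
                      qty_node.firstChild.nodeValue))
    return pairs


def format_unit(ident, qty):
    """Format a pair as 'id/+qty', the layout requested earlier in the thread."""
    return "{}/+{}".format(ident, qty)


doc = minidom.parseString(SAMPLE)
for ident, qty in paired_units(doc):
    print(ident, qty)  # 021360 10, then 023408 2, then 064014 10
```

Pairing the two node lists positionally with zip(itemlist, quantity) would also work, but only if every unit is guaranteed to have exactly one ID and one quantity; iterating per unit avoids that assumption.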