Monday, March 19, 2012

Help needed with importing XML

SQL server 2005
ive been trying very unsucessfully to try and import an xml file into a
SQL2005 table. I have a schema but am unsure as what to do next. I have been
looking at lots od different exmples but its way over my head. i need a step
by step example that will import the following xml file into two tables
<?xml version="1.0" encoding="ISO-8859-1"?>
<BACSDocument>
<Data>
<ARUCS>
<Header reportType="REFT1027" adviceNumber="01077"
currentProcessingDate="2005-12-05"></Header>
<AddresseeInformation name="Mr Bean "
address1="Company Name " address2="This Place "
address3="This Town " address4="This County
" address5="A12 45T "></AddresseeInformation>
</ARUCS>
</Data>
<SignatureMethod></SignatureMethod>
<Signature></Signature>
</BACSDocument>
table 1 will contain all the header information in fields , reportType,
advicenumber etc as all the address info into table 2.
Once i have created the tables, is there a way to just do the import direct
in TSQL
any help will be very welcome as im struggling to grasp this
Hi Peter,
I've made up an really simple table definition for table1 and table2 and
left some of the columns out to cut down on size. You want to decompose the
XML into relational name-value pairs and do an insert of pieces into two
separate tables. This is *assuming* you don't have more than one Header or
AddresseeInformation in the document, or more than one document in the XML.
Example follows mail message.
First way is the easiest. Use the xml.value method to extract each value,
given the attribute name in the document, 1 column per column in the rowset
(table) you want. This may be able to be optimized by changing the query,
but I'm trying to keep it simple for exposition.
Second way is to use xml.nodes method to obtain a rowset of name-value
pairs. Then use the PIVOT operator to pivot the values into one row and
multiple column values. I've done it in two steps (#temp table) to
illustrate what nodes returns, then combined it into one step.
You could also use OpenXML to do this, but it might require more storage
overhead.
Hope this helps,
Cheers,
Bob Beauchemin
http://www.SQLskills.com/blogs/bobb
CREATE TABLE table1 (
reporttype varchar(100),
adviceNumber varchar(100),
currentProcessingDate datetime
)
go
CREATE TABLE table2 (
name varchar(100),
address1 varchar(100),
address2 varchar(100)
)
go
declare @.x xml
set @.x =
'<?xml version="1.0" encoding="ISO-8859-1"?>
<BACSDocument>
<Data>
<ARUCS>
<Header reportType="REFT1027" adviceNumber="01077"
currentProcessingDate="2005-12-05"></Header>
<AddresseeInformation name="Mr Bean "
address1="Company Name " address2="This Place "
address3="This Town " address4="This County
" address5="A12 45T "></AddresseeInformation>
</ARUCS>
</Data>
<SignatureMethod></SignatureMethod>
<Signature></Signature>
</BACSDocument>
'
/* First way, using xml.value
INSERT table1
select @.x.value('(/BACSDocument/Data/ARUCS/Header)[1]/@.reportType',
'varchar(100)') as a,
@.x.value('(/BACSDocument/Data/ARUCS/Header)[1]/@.adviceNumber',
'varchar(100)') as b,
@.x.value('(/BACSDocument/Data/ARUCS/Header)[1]/@.currentProcessingDate',
'datetime') as c
SELECT * FROM table1
-- now do the same for table2
INSERT table2
select @.x.value('(/BACSDocument/Data/ARUCS/AddresseeInformation)[1]/@.name',
'varchar(100)') as a,
@.x.value('(/BACSDocument/Data/ARUCS/AddresseeInformation)[1]/@.address1',
'varchar(100)') as b,
@.x.value('(/BACSDocument/Data/ARUCS/AddresseeInformation)[1]/@.address2',
'varchar(100)') as c
SELECT * FROM table2
*/
/* Second way, using xml.nodes, intermediate table for exposition
select t.c.value('local-name(.)', 'varchar(50)') as Name,
t.c.value('data(.)', 'varchar(100)') as Value
into #temp
from @.x.nodes('/BACSDocument/Data/ARUCS/Header/@.*') as t(c)
select * from #temp
SELECT [reportType], [adviceNumber], [currentProcessingDate]
FROM #temp
PIVOT (
MAX([Value]) FOR
[Name] IN ([reportType], [adviceNumber], [currentProcessingDate])
) as p
-- now do the same for table2 (elided)
*/
-- third way, combination of second way into one statement.
insert table1
select [reportType], [adviceNumber], [currentProcessingDate]
from
(
SELECT t.c.value('local-name(.)', 'varchar(50)') as [Name],
t.c.value('data(.)', 'varchar(100)') as [Value]
FROM @.x.nodes('/BACSDocument/Data/ARUCS/Header/@.*') as t(c)
) AS namevalue
PIVOT (
MAX([Value]) FOR
[Name] IN ([reportType], [adviceNumber], [currentProcessingDate])
) as p
-- now do the same for table2 (elided)
"Peter Newman" <PeterNewman@.discussions.microsoft.com> wrote in message
news:E8722036-4DD5-4F15-A4FB-6CF9BF9044EF@.microsoft.com...
> SQL server 2005
> ive been trying very unsucessfully to try and import an xml file into a
> SQL2005 table. I have a schema but am unsure as what to do next. I have
> been
> looking at lots od different exmples but its way over my head. i need a
> step
> by step example that will import the following xml file into two tables
> <?xml version="1.0" encoding="ISO-8859-1"?>
> <BACSDocument>
> <Data>
> <ARUCS>
> <Header reportType="REFT1027" adviceNumber="01077"
> currentProcessingDate="2005-12-05"></Header>
> <AddresseeInformation name="Mr Bean "
> address1="Company Name " address2="This Place
> "
> address3="This Town " address4="This County
> " address5="A12 45T "></AddresseeInformation>
> </ARUCS>
> </Data>
> <SignatureMethod></SignatureMethod>
> <Signature></Signature>
> </BACSDocument>
> table 1 will contain all the header information in fields , reportType,
> advicenumber etc as all the address info into table 2.
> Once i have created the tables, is there a way to just do the import
> direct
> in TSQL
> any help will be very welcome as im struggling to grasp this
|||Bob,
Thank you very much, that has showed me a lot and i have managed to adapt it
to the nomal xml files that im currently recieving, however, and theres
always a however, some of the files have more than one element of the ssame
name, so in the file i have shown earlier, how would i handle it if say it
had three AddresseeInformation for example
thansk in advance
"Bob Beauchemin" wrote:

> Hi Peter,
> I've made up an really simple table definition for table1 and table2 and
> left some of the columns out to cut down on size. You want to decompose the
> XML into relational name-value pairs and do an insert of pieces into two
> separate tables. This is *assuming* you don't have more than one Header or
> AddresseeInformation in the document, or more than one document in the XML.
> Example follows mail message.
> First way is the easiest. Use the xml.value method to extract each value,
> given the attribute name in the document, 1 column per column in the rowset
> (table) you want. This may be able to be optimized by changing the query,
> but I'm trying to keep it simple for exposition.
> Second way is to use xml.nodes method to obtain a rowset of name-value
> pairs. Then use the PIVOT operator to pivot the values into one row and
> multiple column values. I've done it in two steps (#temp table) to
> illustrate what nodes returns, then combined it into one step.
> You could also use OpenXML to do this, but it might require more storage
> overhead.
> Hope this helps,
> Cheers,
> Bob Beauchemin
> http://www.SQLskills.com/blogs/bobb
> CREATE TABLE table1 (
> reporttype varchar(100),
> adviceNumber varchar(100),
> currentProcessingDate datetime
> )
> go
> CREATE TABLE table2 (
> name varchar(100),
> address1 varchar(100),
> address2 varchar(100)
> )
> go
> declare @.x xml
> set @.x =
> '<?xml version="1.0" encoding="ISO-8859-1"?>
> <BACSDocument>
> <Data>
> <ARUCS>
> <Header reportType="REFT1027" adviceNumber="01077"
> currentProcessingDate="2005-12-05"></Header>
> <AddresseeInformation name="Mr Bean "
> address1="Company Name " address2="This Place "
> address3="This Town " address4="This County
> " address5="A12 45T "></AddresseeInformation>
> </ARUCS>
> </Data>
> <SignatureMethod></SignatureMethod>
> <Signature></Signature>
> </BACSDocument>
> '
> /* First way, using xml.value
> INSERT table1
> select @.x.value('(/BACSDocument/Data/ARUCS/Header)[1]/@.reportType',
> 'varchar(100)') as a,
> @.x.value('(/BACSDocument/Data/ARUCS/Header)[1]/@.adviceNumber',
> 'varchar(100)') as b,
> @.x.value('(/BACSDocument/Data/ARUCS/Header)[1]/@.currentProcessingDate',
> 'datetime') as c
> SELECT * FROM table1
> -- now do the same for table2
> INSERT table2
> select @.x.value('(/BACSDocument/Data/ARUCS/AddresseeInformation)[1]/@.name',
> 'varchar(100)') as a,
> @.x.value('(/BACSDocument/Data/ARUCS/AddresseeInformation)[1]/@.address1',
> 'varchar(100)') as b,
> @.x.value('(/BACSDocument/Data/ARUCS/AddresseeInformation)[1]/@.address2',
> 'varchar(100)') as c
> SELECT * FROM table2
> */
>
> /* Second way, using xml.nodes, intermediate table for exposition
> select t.c.value('local-name(.)', 'varchar(50)') as Name,
> t.c.value('data(.)', 'varchar(100)') as Value
> into #temp
> from @.x.nodes('/BACSDocument/Data/ARUCS/Header/@.*') as t(c)
> select * from #temp
> SELECT [reportType], [adviceNumber], [currentProcessingDate]
> FROM #temp
> PIVOT (
> MAX([Value]) FOR
> [Name] IN ([reportType], [adviceNumber], [currentProcessingDate])
> ) as p
> -- now do the same for table2 (elided)
> */
> -- third way, combination of second way into one statement.
> insert table1
> select [reportType], [adviceNumber], [currentProcessingDate]
> from
> (
> SELECT t.c.value('local-name(.)', 'varchar(50)') as [Name],
> t.c.value('data(.)', 'varchar(100)') as [Value]
> FROM @.x.nodes('/BACSDocument/Data/ARUCS/Header/@.*') as t(c)
> ) AS namevalue
> PIVOT (
> MAX([Value]) FOR
> [Name] IN ([reportType], [adviceNumber], [currentProcessingDate])
> ) as p
> -- now do the same for table2 (elided)
> "Peter Newman" <PeterNewman@.discussions.microsoft.com> wrote in message
> news:E8722036-4DD5-4F15-A4FB-6CF9BF9044EF@.microsoft.com...
>
>
|||Hi Peter,
Those examples showed how to decompose arbitrary XML in multiple unrealated
tables. In a relational database you need to have something tying together
table1 and table2 (adviseNumber?, reportType?). There's nothing in the
document to deduce this. Also, relational doesn't allow repeating groups, so
you'd need a discriminator to distinguish between the 3 nodes. You could
either use ordinal (as I used [1] in the first example to indicate the 1st
AddresseeInformation) or use nodes to do it in one step and insert multiple
rows. But there has to be something in the relational schema tying table1
and table2 together.
Bob Beauchemin
http://www.SQLskills.com/blogs/bobb
"Peter Newman" <PeterNewman@.discussions.microsoft.com> wrote in message
news:3D443871-A973-4407-96EE-5E44EA309BBF@.microsoft.com...[vbcol=seagreen]
> Bob,
> Thank you very much, that has showed me a lot and i have managed to adapt
> it
> to the nomal xml files that im currently recieving, however, and theres
> always a however, some of the files have more than one element of the
> ssame
> name, so in the file i have shown earlier, how would i handle it if say
> it
> had three AddresseeInformation for example
> thansk in advance
> "Bob Beauchemin" wrote:
|||Bob,
Ive spent he entire day chasing ghosts tying to assing a discriminator , its
easy to code when i know how meny there will be but as these reports are
dynamic i need to find a way to change rowcount to equal the ordinal, and
loop till all the rows have been imported
SELECT t.c.value('local-name(.)', 'varchar(50)') AS Name,
t.c.value('data(.)', 'Varchar(100)') as Value, '1' as RowNumber
INTO #ReturnedItem
FROM
@.XMLDOC.nodes('/BACSDocument/Data/ARUCS/Advice/OriginatingAccountRecords/OriginatingAccountRecord/ReturnedCreditItem[1]/@.*') as t(c)
youve been such a great help so far, think one i can resolve this issue i
can carry on on my own
"Bob Beauchemin" wrote:

> Hi Peter,
> Those examples showed how to decompose arbitrary XML in multiple unrealated
> tables. In a relational database you need to have something tying together
> table1 and table2 (adviseNumber?, reportType?). There's nothing in the
> document to deduce this. Also, relational doesn't allow repeating groups, so
> you'd need a discriminator to distinguish between the 3 nodes. You could
> either use ordinal (as I used [1] in the first example to indicate the 1st
> AddresseeInformation) or use nodes to do it in one step and insert multiple
> rows. But there has to be something in the relational schema tying table1
> and table2 together.
> Bob Beauchemin
> http://www.SQLskills.com/blogs/bobb
>
> "Peter Newman" <PeterNewman@.discussions.microsoft.com> wrote in message
> news:3D443871-A973-4407-96EE-5E44EA309BBF@.microsoft.com...
>
>
|||If I think I'm understanding what you're asking, there's a few ways to do
this. You could add a gratuitous identity column to #temp and use
INSERT...SELECT instead of SELECT INTO. You could loop using a T-SQL
variable until the nodes function returns no nodes, using sql:variable in
the XQuery predicate. You could have also changed the XPath expression to a
FLWOR expression and selected the position, but SQL Server XQuery doesn't
support the "at" portion of "for $x at $y in ..." syntax or the position()
function used outside of the predicate.
Hope this helps,
Bob Beauchemin
http://www.SQLskills.com/blogs/bobb
"Peter Newman" <PeterNewman@.discussions.microsoft.com> wrote in message
news:F327D2A5-07D4-4F37-82BA-23B4BEA60226@.microsoft.com...[vbcol=seagreen]
> Bob,
> Ive spent he entire day chasing ghosts tying to assing a discriminator ,
> its
> easy to code when i know how meny there will be but as these reports are
> dynamic i need to find a way to change rowcount to equal the ordinal, and
> loop till all the rows have been imported
> SELECT t.c.value('local-name(.)', 'varchar(50)') AS Name,
> t.c.value('data(.)', 'Varchar(100)') as Value, '1' as RowNumber
> INTO #ReturnedItem
> FROM
> @.XMLDOC.nodes('/BACSDocument/Data/ARUCS/Advice/OriginatingAccountRecords/OriginatingAccountRecord/ReturnedCreditItem[1]/@.*')
> as t(c)
>
> youve been such a great help so far, think one i can resolve this issue i
> can carry on on my own
> "Bob Beauchemin" wrote:
|||Bob,
Thanks again for the pointers yhoi have to admit ive spend a few hours and
srtill carnt grasp it. I understand what you were saying about the
relationship between table 1 and table 2, that i think i can work out, what
im still struggling with is returning the pivot table for the three addresss
( only for example ). I still can not get the identity colum in using your
previous eamaples. sorry to be a pain but can you pint me in the direction
of an example based of what you have already explained
thanks again for all your help
"Bob Beauchemin" wrote:

> If I think I'm understanding what you're asking, there's a few ways to do
> this. You could add a gratuitous identity column to #temp and use
> INSERT...SELECT instead of SELECT INTO. You could loop using a T-SQL
> variable until the nodes function returns no nodes, using sql:variable in
> the XQuery predicate. You could have also changed the XPath expression to a
> FLWOR expression and selected the position, but SQL Server XQuery doesn't
> support the "at" portion of "for $x at $y in ..." syntax or the position()
> function used outside of the predicate.
> Hope this helps,
> Bob Beauchemin
> http://www.SQLskills.com/blogs/bobb
> "Peter Newman" <PeterNewman@.discussions.microsoft.com> wrote in message
> news:F327D2A5-07D4-4F37-82BA-23B4BEA60226@.microsoft.com...
>
>

No comments:

Post a Comment