Working With XML in Scala
Scala makes it pretty easy to work with XML. Take a look at how you can create XML documents, parse them, read them, and store them.
In this blog, we will talk about how we can work with XML using Scala.
Scala treats XML as a first-class citizen. So, instead of embedding XML documents into strings, you can place them inline in your code like you would place an int or double value.
For example:
scala> val xml = <greet>Hello</greet>
xml: scala.xml.Elem = <greet>Hello</greet>
scala> xml.getClass
res2: Class[_ <: scala.xml.Elem] = class scala.xml.Elem
We have created a val named xml and assigned sample XML content to it. Scala parses it and creates an instance of scala.xml.Elem.
The Scala package scala.xml provides classes to create XML documents, parse them, read them, and store them.
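When the XML arrives as an ordinary string rather than an inline literal, the same package can parse it. The following is a minimal sketch, assuming the scala-xml module is on the classpath; the object name XmlRoundTrip is hypothetical and used only for illustration.

```scala
import scala.xml.{Elem, XML}

object XmlRoundTrip {
  // Parse XML held in an ordinary string into the same Elem type an inline literal produces.
  val parsed: Elem = XML.loadString("<greet>Hello</greet>")

  // The element name and its text content.
  val label: String = parsed.label
  val text: String = parsed.text

  // Calling toString serializes the element back to markup.
  val serialized: String = parsed.toString

  def main(args: Array[String]): Unit = {
    println(label)      // greet
    println(text)       // Hello
    println(serialized) // <greet>Hello</greet>
  }
}
```

Because XML.loadString returns a scala.xml.Elem, everything shown for inline literals in the rest of this post should apply to the parsed value as well.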
Let’s see how we can query it. XPath is a powerful tool for querying XML documents, and Scala provides an XPath-like query ability with a slight difference. In XPath, we use forward slashes “/” and “//” to query XML documents. But in Scala, “/” is the division operator and “//” starts a comment, so Scala uses backslashes “\” and “\\” to query the XML document instead.
For example:
scala> val xmlDoc = <symbols>
<symbol ticker="Cisco" ><units>100</units></symbol>
<symbol ticker="Sandisk" ><units>315</units></symbol>
</symbols>
We would like to get the symbol elements. We can use an XPath-like query:
scala> val children = xmlDoc \ "symbol"
children: scala.xml.NodeSeq = NodeSeq(<symbol ticker="Cisco"><units>100</units></symbol>, <symbol ticker="Sandisk"><units>315</units></symbol>)
We called the \ method on the XML element and asked it to look for all symbol elements. It returns an instance of scala.xml.NodeSeq, which represents a collection of XML nodes.
The \ method looks only at the direct children of the target element (here, symbols). If we want to search through all the elements in the hierarchy below the target element, we use the \\ method:
scala> val grandChildren = xmlDoc \\ "units"
grandChildren: scala.xml.NodeSeq = NodeSeq(<units>100</units>, <units>315</units>)
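The difference between the two operators is easiest to see by running both against the same nested element. This is a small sketch reusing the symbols document from above; the object name ChildVersusDescendant and the value names are hypothetical.

```scala
import scala.xml.{Elem, NodeSeq}

object ChildVersusDescendant {
  val xmlDoc: Elem =
    <symbols>
      <symbol ticker="Cisco"><units>100</units></symbol>
      <symbol ticker="Sandisk"><units>315</units></symbol>
    </symbols>

  // units is a grandchild of symbols, so the child query matches nothing.
  val direct: NodeSeq = xmlDoc \ "units"

  // The descendant query searches the whole subtree and finds both units elements.
  val nested: NodeSeq = xmlDoc \\ "units"

  // Child queries can also be chained to walk down one level at a time.
  val chained: NodeSeq = xmlDoc \ "symbol" \ "units"

  def main(args: Array[String]): Unit = {
    println(direct.isEmpty)             // true
    println(nested.map(_.text).toList)  // List(100, 315)
    println(chained.map(_.text).toList) // List(100, 315)
  }
}
```

Chaining \ is stricter than \\ because it only matches units elements that sit directly under a symbol element, which can matter when the same tag name appears at several depths.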
And we can use the text method to get the text node within an element. Let's see another example:
scala> val doc = <languages>
<language>Scala</language>
<language>Java</language>
<language>C++</language>
<language>Kotlin</language>
</languages>
scala> val children = (doc \ "language")
children: scala.xml.NodeSeq = NodeSeq(<language>Scala</language>, <language>Java</language>, <language>C++</language>, <language>Kotlin</language>)
We can iterate through the NodeSeq and print the node using text:
scala> children.foreach(child => println(child.text))
Scala
Java
C++
Kotlin
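There is also a child method to get all the children of the root element. Note that the result includes the whitespace text nodes between the elements, which is why blank entries appear in the output: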
scala> val children = doc.child
children: Seq[scala.xml.Node] =
ArrayBuffer(
, <language>Scala</language>,
, <language>Java</language>,
, <language>C++</language>,
, <language>Kotlin</language>,
)
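If only the element children are of interest, the whitespace text nodes can be filtered out with a pattern match. This is one possible approach, sketched with a shortened version of the languages document; the object name ElementChildren is hypothetical.

```scala
import scala.xml.{Elem, Node}

object ElementChildren {
  val doc: Elem =
    <languages>
      <language>Scala</language>
      <language>Java</language>
    </languages>

  // Every child node, including the whitespace text nodes between elements.
  val all: Seq[Node] = doc.child

  // Keep only the children that are elements, dropping text nodes.
  val elementsOnly: Seq[Elem] = doc.child.collect { case e: Elem => e }

  def main(args: Array[String]): Unit = {
    elementsOnly.foreach(e => println(s"${e.label}: ${e.text}"))
  }
}
```

Matching on Elem rather than testing for blank text keeps any element children regardless of their content.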
Want to extract the attributes from XML elements? Prefix the attribute name with @ in the query:
scala> val xmlLanguage = <languages>
| <language platform = "jvm">Scala</language>
| <language platform = "jvm">Java</language>
| <language platform = "clr">C#</language>
| </languages>
scala> val attributes = xmlLanguage \\ "@platform"
attributes: scala.xml.NodeSeq = NodeSeq(jvm, jvm, clr)
scala> attributes.foreach(e => println(e))
jvm
jvm
clr
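The query above returns the attribute values on their own, detached from their elements. To keep each value next to the element it belongs to, the attribute query can be applied per element. The sketch below assumes the same languages document; the object name PlatformAttributes and the value names are hypothetical.

```scala
import scala.xml.Elem

object PlatformAttributes {
  val xmlLanguage: Elem =
    <languages>
      <language platform="jvm">Scala</language>
      <language platform="jvm">Java</language>
      <language platform="clr">C#</language>
    </languages>

  // Pair each language name with the value of its platform attribute.
  val byLanguage: List[(String, String)] =
    (xmlLanguage \ "language")
      .map(l => (l.text, (l \ "@platform").text))
      .toList

  // Keep only the languages whose platform attribute is "jvm".
  val jvmLanguages: List[String] =
    (xmlLanguage \ "language")
      .filter(l => (l \ "@platform").text == "jvm")
      .map(_.text)
      .toList

  def main(args: Array[String]): Unit = {
    println(byLanguage)   // List((Scala,jvm), (Java,jvm), (C#,clr))
    println(jvmLanguages) // List(Scala, Java)
  }
}
```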
And what about loading an XML document from a local source?
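The scala.xml.XML object provides a load method that takes a path (treated as a system identifier), along with overloads for an InputStream, a Reader, or a URL, and loads the XML document into memory. For a local file, XML.loadFile(path) is the more explicit choice.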
import scala.xml._
val xmlFile = XML.load(path)
This loads the XML content into memory as a scala.xml.Elem, which we can then query or manipulate just like an inline XML literal.
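Storing and loading can be tried together with a temporary file. The following is a sketch under a few assumptions: it uses XML.save and XML.loadFile from scala-xml, writes to a temporary file created just for the demonstration, and the object name SaveAndLoad and the method roundTrip are hypothetical.

```scala
import java.io.File
import scala.xml.{Elem, XML}

object SaveAndLoad {
  val doc: Elem =
    <languages><language>Scala</language><language>Kotlin</language></languages>

  // Write the document to a temporary file, read it back, and return the language names.
  def roundTrip(): List[String] = {
    val tmp = File.createTempFile("languages", ".xml")
    try {
      // Arguments: file name, node to write, encoding, whether to emit the XML declaration.
      XML.save(tmp.getAbsolutePath, doc, "UTF-8", true)
      val loaded: Elem = XML.loadFile(tmp)
      (loaded \ "language").map(_.text).toList
    } finally {
      tmp.delete() // clean up the temporary file
    }
  }

  def main(args: Array[String]): Unit = {
    println(roundTrip()) // List(Scala, Kotlin)
  }
}
```

In real code the path would normally come from configuration or a command-line argument rather than a temporary file.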
Published at DZone with permission of Mahesh Chand, DZone MVB. See the original article here.