What is new in XSLT 2.0:
Given any XML document, produce a list of the words that appear in its text, giving the number of times each word appears (its frequency).
frequency.xml
<?xml version="1.0" encoding="iso-8859-1"?>
<?xml-stylesheet type="text/xsl" href="frequency.xsl"?>
<Sample>
  <TITLE>a a a a a</TITLE>
  <TITLE>1 1 1 1 1</TITLE>
  <TITLE>A A A A A</TITLE>
  <TITLE>! ! ! ! !</TITLE>
  <TITLE>b b b b b</TITLE>
</Sample>
frequency.xsl
<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:template match="/">
    <wordcount>
      <!-- Split every text node on runs of non-word characters (\W+),
           drop empty tokens, fold to lower case, then group equal words -->
      <xsl:for-each-group group-by="."
          select="for $w in //text()/tokenize(., '\W+')[. != ''] return lower-case($w)">
        <!-- Most frequent words first -->
        <xsl:sort select="count(current-group())" order="descending"/>
        <word word="{current-grouping-key()}" frequency="{count(current-group())}"/>
      </xsl:for-each-group>
    </wordcount>
  </xsl:template>
</xsl:stylesheet>
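How to run:
c:\saxon>java -jar C:\saxon\saxon9ee.jar -a -s:sample/frequency.xml -o:sample/frequency.html
The -a option tells Saxon to use the stylesheet named in the xml-stylesheet processing instruction of the source document, so no stylesheet is given on the command line.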
The output file should look like this. Despite the .html extension it contains XML, because the stylesheet specifies method="xml".
frequency.html
<?xml version="1.0" encoding="UTF-8"?>
<wordcount>
  <word word="a" frequency="10"/>
  <word word="1" frequency="5"/>
  <word word="b" frequency="5"/>
</wordcount>
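The stylesheet counts only words: tokens made up entirely of special characters, such as "!", are dropped. Digits count as word characters, which is why "1" appears in the output.
Counting is case-insensitive: every word is converted to lower case before grouping, so "A" and "a" are counted as the same word.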
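To see the splitting and case-folding steps in isolation, the same expression can be applied to a literal string. The following is only a minimal sketch and not part of frequency.xsl; the sample string and the element names tokens and token are made up for illustration.

```xml
<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <!-- Can be run against any XML document; the input content is ignored -->
  <xsl:template match="/">
    <tokens>
      <!-- Split on runs of non-word characters, drop empty strings,
           then convert each remaining token to lower case -->
      <xsl:for-each select="for $w in tokenize('Hello, World! hello?', '\W+')[. != '']
                            return lower-case($w)">
        <token><xsl:value-of select="."/></token>
      </xsl:for-each>
    </tokens>
  </xsl:template>
</xsl:stylesheet>
```

With an XSLT 2.0 processor this should produce three token elements containing hello, world and hello. The trailing "?" yields an empty string after the last separator, which the [. != ''] filter removes.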
Now for the new XSLT 2.0 features involved:
1) The <xsl:for-each-group> instruction is new in XSLT 2.0.
2) The tokenize() function is new in XPath 2.0, and is therefore available in XSLT 2.0.
3) tokenize() takes a regular expression, here \W+ (one or more non-word characters); regular expression support is new in XSLT 2.0 and XPath 2.0.
4) The lower-case() function is also new in XPath 2.0.
5) The avg() function is new in XPath 2.0 as well, although this stylesheet does not call it.
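The avg() function in the list is not called anywhere in frequency.xsl. One place it could be used, shown here only as a sketch under that assumption, is to report the average number of occurrences per distinct word. The summary element and its attribute names are hypothetical, as are the variable names words and counts.

```xml
<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:template match="/">
    <!-- All words in the document, lower-cased, as in frequency.xsl -->
    <xsl:variable name="words"
        select="for $w in //text()/tokenize(., '\W+')[. != ''] return lower-case($w)"/>
    <!-- One occurrence count per distinct word -->
    <xsl:variable name="counts"
        select="for $k in distinct-values($words) return count($words[. = $k])"/>
    <!-- avg() returns the arithmetic mean of the counts -->
    <summary distinct-words="{count($counts)}" average-frequency="{avg($counts)}"/>
  </xsl:template>
</xsl:stylesheet>
```

For frequency.xml the counts would be 10, 5 and 5, so the average comes out at roughly 6.67; the exact number of decimal places printed depends on the processor.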