Feb 04 2012 16:17 UTC
Java Beans and Large Files Example

Summary

This is a longer example showing you how to load XML data into Java Beans with XML Manager, and how to stream that data so that you can work with large files.

The Data

We will use a small XML file as our initial example data, before we look at working with large files. This file contains a list of the projects and employees in a company, and indicates which employee is in which project.

company.xml
<company>
<project num="1" name="Foo">
<deadline>2006-10-20</deadline>
<employees>1:2:3</employees>
</project>
<project num="2" name="Bar">
<deadline>2006-11-15</deadline>
<employees>3:4:5</employees>
</project>
<employee num="1" manager="false">
<name>John Doe</name>
<hired>2005-01-01</hired>
<projects>1</projects>
</employee>
<employee num="2" manager="false">
<name>Jane Doe</name>
<hired>2005-02-02</hired>
<projects>1</projects>
</employee>
<employee num="3" manager="true">
<name>Mr. Manager</name>
<hired>2005-03-03</hired>
<projects>1:2</projects>
</employee>
<employee num="4" manager="false">
<name>Joe Bloggs</name>
<hired>2005-04-04</hired>
<projects>2</projects>
</employee>
<employee num="5" manager="false">
<name>Jane Bloggs</name>
<hired>2005-05-05</hired>
<projects>2</projects>
</employee>
</company>
The data contains some special formats that we need to deal with. First, the dates are given in the format YYYY-MM-DD, so we'll have to parse them using a DateFormat. Also, the employees and projects elements hold lists of numbers separated by colons, which the beans store as int arrays.

We want to display this data as a set of HTML tables showing the projects and employees, and who is assigned to what.
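The two special formats in the data can both be handled with the standard Java library. The sketch below is only an illustration of the parsing involved: the class CompanyFormats and its method names are hypothetical and are not part of XML Manager or of the example source files.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

/** Hypothetical helper for the two special formats in company.xml. */
final class CompanyFormats {

    private CompanyFormats() {}

    /** Parses a date such as "2006-10-20" (YYYY-MM-DD). */
    static Date parseDate(String text) {
        // SimpleDateFormat is not thread-safe, so a new one is created per call.
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd");
        format.setLenient(false); // reject impossible dates such as 2006-13-40
        try {
            return format.parse(text.trim());
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a YYYY-MM-DD date: " + text, e);
        }
    }

    /** Parses a colon-separated list such as "1:2:3" into an int array. */
    static int[] parseNumberList(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return new int[0]; // an empty element means an empty list
        }
        String[] parts = trimmed.split(":");
        int[] numbers = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            numbers[i] = Integer.parseInt(parts[i].trim());
        }
        return numbers;
    }
}
```

For example, CompanyFormats.parseNumberList("1:2:3") gives an array holding 1, 2 and 3, which is the shape of value that the int array properties of the beans expect.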
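To get the employee data we will use the following XPath expressions:

  /company/employee    selects each employee record
  @num                 the employee number
  @manager             whether the employee is a manager
  name                 the employee's name
  hired                the date the employee was hired
  projects             the numbers of the employee's projects

To get the project data we will use the following XPath expressions:

  /company/project     selects each project record
  @num                 the project number
  @name                the project name
  deadline             the project deadline
  employees            the numbers of the employees on the project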
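To see what XPath expressions of this kind select, here is a minimal sketch that evaluates a record expression and a set of field expressions relative to it, using the standard javax.xml.xpath API rather than XML Manager. The class XPathDemo and its method are hypothetical names used only for this illustration, and because it builds a DOM tree it is not suitable for large files.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical demo: evaluates the employee XPaths with the JDK's own XPath engine. */
final class XPathDemo {

    private XPathDemo() {}

    /** Returns one comma-separated row per employee: num,manager,name,hired,projects. */
    static String[] employeeRows(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();

            // The record expression selects one node per employee.
            NodeList employees = (NodeList) xpath.evaluate(
                    "/company/employee", doc, XPathConstants.NODESET);

            // The field expressions are evaluated relative to each record node.
            String[] fields = {"@num", "@manager", "name", "hired", "projects"};
            String[] rows = new String[employees.getLength()];
            for (int i = 0; i < employees.getLength(); i++) {
                Node employee = employees.item(i);
                StringBuilder row = new StringBuilder();
                for (int f = 0; f < fields.length; f++) {
                    if (f > 0) {
                        row.append(',');
                    }
                    row.append(xpath.evaluate(fields[f], employee));
                }
                rows[i] = row.toString();
            }
            return rows;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not read the XML", e);
        }
    }
}
```

Run against company.xml, each returned row holds the five field values of one employee in the order listed.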
The Beans

Here are the beans that we want to use:
import java.util.Date;

// Bean holding the details of one employee.
public class Employee {
    private int iNumber = 0;
    private String iName = "";
    private Date iHireDate = new Date();
    private boolean iManager = false;
    private int[] iProjectNumbers = new int[] {};

    public int getNumber() {
        return iNumber;
    }
    public void setNumber( int pNumber ) {
        iNumber = pNumber;
    }

    public String getName() {
        return iName;
    }
    public void setName( String pName ) {
        iName = pName;
    }

    public Date getHireDate() {
        return iHireDate;
    }
    public void setHireDate( Date pHireDate ) {
        iHireDate = pHireDate;
    }

    public boolean isManager() {
        return iManager;
    }
    public void setManager( boolean pManager ) {
        iManager = pManager;
    }

    // Array property, with indexed accessors for single elements.
    public int[] getProjectNumbers() {
        return iProjectNumbers;
    }
    public void setProjectNumbers( int[] pProjectNumbers ) {
        iProjectNumbers = pProjectNumbers;
    }
    public int getProjectNumber( int pIndex ) {
        return iProjectNumbers[pIndex];
    }
    public void setProjectNumber( int pIndex, int pNumber ) {
        iProjectNumbers[pIndex] = pNumber;
    }
}
import java.util.Date;

// Bean holding the details of one project.
public class Project {
    private int iNumber = 0;
    private String iName = "";
    private Date iDeadline = new Date();
    private int[] iEmployeeNumbers = new int[] {};

    public int getNumber() {
        return iNumber;
    }
    public void setNumber( int pNumber ) {
        iNumber = pNumber;
    }

    public String getName() {
        return iName;
    }
    public void setName( String pName ) {
        iName = pName;
    }

    public Date getDeadline() {
        return iDeadline;
    }
    public void setDeadline( Date pDeadline ) {
        iDeadline = pDeadline;
    }

    // Array property, with indexed accessors for single elements.
    public int[] getEmployeeNumbers() {
        return iEmployeeNumbers;
    }
    public void setEmployeeNumbers( int[] pEmployeeNumbers ) {
        iEmployeeNumbers = pEmployeeNumbers;
    }
    public int getEmployeeNumber( int pIndex ) {
        return iEmployeeNumbers[pIndex];
    }
    public void setEmployeeNumber( int pIndex, int pNumber ) {
        iEmployeeNumbers[pIndex] = pNumber;
    }
}
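Because these classes follow the Java Beans naming conventions, their properties can be found and set by name at run time. The sketch below shows the general idea with the standard java.beans.Introspector. It is not XML Manager's implementation: PropertySetter, its setFromString method and the Person stand-in bean are hypothetical names, and only int, boolean and String properties are converted.

```java
import java.beans.BeanInfo;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;

/** Hypothetical sketch: sets a bean property by name from a string value. */
final class PropertySetter {

    private PropertySetter() {}

    /** Small stand-in bean that follows the same conventions as Employee. */
    public static class Person {
        private int number;
        private String name = "";
        private boolean manager;

        public int getNumber() { return number; }
        public void setNumber(int number) { this.number = number; }
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public boolean isManager() { return manager; }
        public void setManager(boolean manager) { this.manager = manager; }
    }

    /** Sets the property called fieldName (for example "Number") from a string. */
    static void setFromString(Object bean, String fieldName, String value) {
        // Bean property names start in lower case: "Number" becomes "number".
        String property = Introspector.decapitalize(fieldName);
        try {
            BeanInfo info = Introspector.getBeanInfo(bean.getClass());
            for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
                if (pd.getName().equals(property) && pd.getWriteMethod() != null) {
                    Object converted = convert(value, pd.getPropertyType());
                    pd.getWriteMethod().invoke(bean, converted);
                    return;
                }
            }
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot set " + fieldName, e);
        }
        throw new IllegalArgumentException("No writable property: " + fieldName);
    }

    /** Converts the string to the property type; only three types are handled here. */
    private static Object convert(String value, Class<?> type) {
        if (type == int.class) {
            return Integer.valueOf(value);
        }
        if (type == boolean.class) {
            return Boolean.valueOf(value);
        }
        if (type == String.class) {
            return value;
        }
        throw new IllegalArgumentException("Unsupported type: " + type);
    }
}
```

A call such as PropertySetter.setFromString(person, "Manager", "true") ends up invoking setManager(true) on the bean. Dates and int arrays would need extra conversion rules, which is the kind of gap a custom converter fills.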
And here are the RecordSpec objects that extract the data for these beans:
// This is the XPath to extract the employee details.
RecordSpec rs_employee
    = new RecordSpec("/company/employee",
        new String[]{"'employee'","@num","@manager","name","hired","projects"},
        new String[]{"","Number","Manager","Name","HireDate","ProjectNumbers"});

// This is the XPath to extract the project details.
RecordSpec rs_project
    = new RecordSpec("/company/project",
        new String[]{"'project'","@num","@name","deadline","employees"},
        new String[]{"","Number","Name","Deadline","EmployeeNumbers"});
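The first data field in each of these RecordSpec objects is a literal string, 'employee' or 'project', rather than a value taken from the XML. It acts as a code that tells us which type of record we are looking at.

Notice also that we use the bean property names as the data field names. The final thing that we need is a BeanSpec for each bean class. In the simplest case this is just:

    BeanSpec bs_employee = new BeanSpec( Employee.class );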
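For our case, we need to provide a bit more information, because we are using a custom date field. What we need is a way to tell XML Manager how to convert between a String and a Date. The DateConverter.java file contains a converter that does this. We create it with the date input format and register it with the BeanSpec under the name of the date property, as the make method further down shows.

Loading the Data

OK, now that we've got all the pieces, let's assemble them. Normally when you use XML Manager to load Java Beans, you just make a single call that loads one type of bean. Here the same file holds two types of record, so we need to create two types of bean as we go. We can do this by writing a custom record listener:

MultipleBeansRecordListener.java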
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;

// The XML Manager classes used here (RecordListenerSupport, BeanSpec, RecordSpec,
// XmlSpec, BeanRecordListener and BadRecord) must be imported as well.
public class MultipleBeansRecordListener extends RecordListenerSupport {
    // All three maps are keyed by the record code, for example "employee".
    protected HashMap iBeanListMap = new HashMap();
    protected HashMap iBeanFieldsMap = new HashMap();
    protected HashMap iBeanSpecMap = new HashMap();
    protected boolean iUseDefault = false;

    // Register a bean type under the code that marks its records.
    public void addBeanSpec( String pCodeName, BeanSpec pBeanSpec, RecordSpec pRecordSpec ) {
        iBeanSpecMap.put( pCodeName, pBeanSpec );
        iBeanFieldsMap.put( pCodeName, pRecordSpec.getFieldNames() );
        iBeanListMap.put( pCodeName, new ArrayList() );
    }

    public List getBeans( String pCodeName) {
        return (List) iBeanListMap.get( pCodeName );
    }

    protected void setXmlSpecImpl( XmlSpec pXmlSpec ) {
        iUseDefault = pXmlSpec.getBooleanProperty( BeanRecordListener.PROP_Bean_useDefault );
    }

    protected BadRecord handleRecordImpl( String[] pRecord, long pRecordNumber ) throws Exception {
        // The first field of every record is the code, not a bean property.
        String codename = pRecord[0];
        if( iBeanSpecMap.containsKey( codename ) ) {
            BeanSpec bs = (BeanSpec) iBeanSpecMap.get( codename );
            String[] fieldnames = (String[]) iBeanFieldsMap.get( codename );
            Object bean = bs.getBeanClass().newInstance();
            // Start at 1 to skip the code field.
            for( int fI = 1; fI < fieldnames.length; fI++ ) {
                bs.setStringValue( bean, fieldnames[fI], pRecord[fI], iUseDefault );
            }
            ArrayList beanlist = (ArrayList) iBeanListMap.get( codename );
            beanlist.add( bean );
            return null;
        }
        else {
            return new BadRecord( pRecordNumber, pRecord, "unknown code: "+codename );
        }
    }
}
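This is the core class of our Java Bean reader. The most important line is:

    bs.setStringValue( bean, fieldnames[fI], pRecord[fI], iUseDefault );

This uses the BeanSpec to set the bean property named by fieldnames[fI] from the string value in pRecord[fI]. Notice that we start from the second element of the record array, because the first element holds the record code and not a bean property.

Once we have set all the bean properties, we add the new bean object to the correct list of beans of that type. We store the different bean lists in the iBeanListMap, keyed by record code.

If the code is not recognised, that is, if a BeanSpec has not been registered for it, we return a BadRecord instead.

Let's tie everything together. Here is the make method from MakeTable.java: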
public void make( File pCompanyXmlFile ) throws Exception {
    XmlManager xmlman = new XmlManager();
    // sDateInputFormat is defined elsewhere in MakeTable.java (not shown here).
    DateConverter dc = new DateConverter( sDateInputFormat );

    // Projects: record spec, plus a bean spec that converts the Deadline property.
    RecordSpec rs_project
        = new RecordSpec("/company/project",
            new String[]{"'project'","@num","@name","deadline","employees"},
            new String[]{"","Number","Name","Deadline","EmployeeNumbers"});
    HashMap project_stringconv = new HashMap();
    project_stringconv.put( "Deadline", dc );
    BeanSpec bs_project = new BeanSpec( Project.class, project_stringconv );

    // Employees: record spec, plus a bean spec that converts the HireDate property.
    RecordSpec rs_employee
        = new RecordSpec("/company/employee",
            new String[]{"'employee'","@num","@manager","name","hired","projects"},
            new String[]{"","Number","Manager","Name","HireDate","ProjectNumbers"});
    HashMap employee_stringconv = new HashMap();
    employee_stringconv.put( "HireDate", dc );
    BeanSpec bs_employee = new BeanSpec( Employee.class, employee_stringconv );

    // Register both bean types, then load the file in one pass.
    MultipleBeansRecordListener mb = new MultipleBeansRecordListener();
    mb.addBeanSpec( "project", bs_project, rs_project );
    mb.addBeanSpec( "employee", bs_employee, rs_employee );
    xmlman.load( pCompanyXmlFile, new RecordSpec( rs_project, rs_employee ), mb );

    List employees = mb.getBeans("employee");
    List projects = mb.getBeans("project");
}
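The source code above is simplified so that you can see the flow more clearly. The full version is in the MakeTable.java file. Notice that we associate the DateConverter with the Deadline and HireDate properties by putting it into a HashMap, keyed by property name, and passing that map to the BeanSpec constructor.

All that this method does is assemble the various utility classes and call XML Manager's load method:

    xmlman.load( pCompanyXmlFile, new RecordSpec( rs_project, rs_employee ), mb );

We use one of the RecordSpec convenience constructors so that XML Manager will know to use both rs_project and rs_employee. We are left with two lists of beans, one of employees and one of projects, which are used to generate the HTML tables.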
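With both lists held in memory, a cross-reference from a project to its employees can be resolved by number. The sketch below shows one way to do that lookup. It is an assumption about how such a lookup could be written, not the code in MakeTable.java, and the CrossReference class and its Emp stand-in for the Employee bean are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: turns employee numbers into employee names. */
final class CrossReference {

    private CrossReference() {}

    /** Simplified stand-in for the Employee bean. */
    static final class Emp {
        final int number;
        final String name;

        Emp(int number, String name) {
            this.number = number;
            this.name = name;
        }
    }

    /** Builds a lookup table from employee number to employee name. */
    static Map<Integer, String> namesByNumber(List<Emp> employees) {
        Map<Integer, String> names = new HashMap<Integer, String>();
        for (Emp employee : employees) {
            names.put(employee.number, employee.name);
        }
        return names;
    }

    /** Lists the names for a project's employee numbers, or "#n" if unknown. */
    static String employeeNames(int[] employeeNumbers, Map<Integer, String> names) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < employeeNumbers.length; i++) {
            if (i > 0) {
                out.append(", ");
            }
            String name = names.get(employeeNumbers[i]);
            out.append(name != null ? name : "#" + employeeNumbers[i]);
        }
        return out.toString();
    }
}
```

The lookup table needs every employee to be loaded before any project row is written, so this approach only works while all the beans fit in memory.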
To get this example to run, just compile all the Java files for this example and run MakeTable with the XML file as its argument.

Streaming the Data

Well, this is all very nice, but the example loads every bean into memory before it outputs anything. What happens when the XML file is really large?

The answer is to stream the data. By this we mean that we will not load all the data into memory at once. Instead we will load each data record one at a time, and output each HTML table row one at a time. We will never need to store more than one row in memory. Our current example just generates HTML, but you can also use this technique for loading large volumes of data into databases, or for handling large volumes of web service transactions.

The HTML generated from the streaming solution does have one drawback. The cross-reference links will have to use numbers, not the names of the employees, because we cannot look ahead in the file to records that have not yet been parsed. As always, speed requires trade-offs.

To create a streaming solution, we subclass the MultipleBeansRecordListener:

StreamingMultipleBeansRecordListener.java
import java.io.PrintWriter;

public class StreamingMultipleBeansRecordListener extends MultipleBeansRecordListener {
    protected PrintWriter iPrintWriter = null;
    protected boolean iLastWasEmployee = false;
    protected boolean iFirst = true;

    public StreamingMultipleBeansRecordListener( PrintWriter pPrintWriter ) {
        iPrintWriter = pPrintWriter;
    }

    protected BadRecord handleRecordImpl( String[] pRecord, long pRecordNumber ) throws Exception {
        String codename = pRecord[0];
        if( iBeanSpecMap.containsKey( codename ) ) {
            BeanSpec bs = (BeanSpec) iBeanSpecMap.get( codename );
            String[] fieldnames = (String[]) iBeanFieldsMap.get( codename );
            Object bean = bs.getBeanClass().newInstance();
            for( int fI = 1; fI < fieldnames.length; fI++ ) {
                bs.setStringValue( bean, fieldnames[fI], pRecord[fI], iUseDefault );
            }
            // Output the bean straight away instead of storing it in a list.
            saveBean( bean );
            return null;
        }
        else {
            return new BadRecord( pRecordNumber, pRecord, "unknown code: "+codename );
        }
    }

    protected void saveBean( Object pBean ) {
        if( pBean instanceof Project ) {
            // Switch to the project table if the previous record was an employee.
            if( iLastWasEmployee || iFirst ) {
                if( !iFirst ) {
                    MakeTable.outputEndEmployeeTable( iPrintWriter );
                }
                else {
                    iFirst = false;
                }
                MakeTable.outputStartProjectTable( iPrintWriter );
            }
            MakeTable.outputProject( (Project) pBean, null, iPrintWriter );
            iLastWasEmployee = false;
        }
        else if( pBean instanceof Employee ) {
            // Switch to the employee table if the previous record was a project.
            if( !iLastWasEmployee || iFirst ) {
                if( !iFirst ) {
                    MakeTable.outputEndProjectTable( iPrintWriter );
                }
                else {
                    iFirst = false;
                }
                MakeTable.outputStartEmployeeTable( iPrintWriter );
            }
            MakeTable.outputEmployee( (Employee) pBean, null, iPrintWriter );
            iLastWasEmployee = true;
        }
    }
}
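As you can see, the handleRecordImpl method is almost the same as before, except that it calls saveBean instead of adding the bean to a list.

The saveBean method writes each bean out as an HTML table row straight away, ending one table and starting the other whenever the record type changes from project to employee or back.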
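The table switching in saveBean can be looked at in isolation. The following sketch writes rows as they arrive and opens or closes a table only when the record type changes. It is a simplified stand-in, not the MakeTable output code: TableStreamer and its methods are hypothetical names, and the cell text is written without HTML escaping to keep the sketch short.

```java
import java.io.PrintWriter;

/** Hypothetical sketch: writes table rows one at a time, keeping no rows in memory. */
final class TableStreamer {
    private final PrintWriter out;
    private String openTable = null; // type of the table currently open, or null

    TableStreamer(PrintWriter out) {
        this.out = out;
    }

    /** Writes one row immediately, switching tables if the type has changed. */
    void row(String type, String text) {
        if (!type.equals(openTable)) {
            closeTable();
            out.print("<table class=\"" + type + "\">\n");
            openTable = type;
        }
        out.print("<tr><td>" + text + "</td></tr>\n");
    }

    /** Closes the last table once there are no more records. */
    void finish() {
        closeTable();
        out.flush();
    }

    private void closeTable() {
        if (openTable != null) {
            out.print("</table>\n");
            openTable = null;
        }
    }
}
```

The only state kept between rows is the type of the table that is currently open, so memory use does not grow with the number of records.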
And that's it. Now, how many records can this handle? Well, we've included a utility class called MakeReallyBigFile.java. This takes a single numeric argument indicating how many records you want to generate. Try it with 1000000. Go on, you know you want to! It generates an XML file called reallybig.xml. First run the non-streaming version:

    java -cp .;..\..\lib\xmlman.jar MakeTable reallybig.xml

If the file is big enough, you'll probably get an out-of-memory error. Now run the streaming version:

    java -cp .:../../lib/xmlman.jar MakeTable reallybig.xml stream

The first command uses the Windows classpath and path separators and the second uses the Unix ones, so use whichever form suits your system.

To keep you updated, status information is output every 100 records. This includes memory usage. Compare the memory usage of the streaming and non-streaming versions. You'll notice that the non-streaming version just keeps using more memory until there's none left. The streaming version maintains a relatively constant memory load and will keep going until your hard disk is full.

Source Code

Here is a list of all the files used in this example. Note that the actual source code is slightly longer than the examples above, which have been abridged for clarity.

  company.xml
  Employee.java
  Project.java
  DateConverter.java
  MultipleBeansRecordListener.java
  StreamingMultipleBeansRecordListener.java
  MakeTable.java
  MakeReallyBigFile.java
Feel free to experiment with these classes and see what happens. You can also use them as the basis for your own streaming solution.

Questions and Comments

Please feel free to email us at examples@ricebridge.com if you have any questions or comments about this example.
Copyright © 2004-2012 Ricebridge. All Rights Reserved.