/* Copyright (c) 2006 Ricebridge. All Rights Reserved.
 *
 * This file is available under the terms and conditions of the
 * Ricebridge "Open Source API" policy; Ricebridge grants use of this
 * copyrighted work under the terms of a BSD-style license only. See
 * http://www.opensource.org/licenses/bsd-license.php for more
 * information.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 *
 * - Redistributions of source code must retain the above copyright
 *   notice, this list of conditions and the following disclaimer.
 *
 * - Redistributions in binary form must reproduce the above
 *   copyright notice, this list of conditions and the following
 *   disclaimer in the documentation and/or other materials provided
 *   with the distribution.
 *
 * - Neither the name of the Ricebridge nor the names of its
 *   contributors may be used to endorse or promote products derived
 *   from this software without specific prior written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
 * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
 * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
 * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
 * COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
 * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
 * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
 * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
 * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
 * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
 * OF THE POSSIBILITY OF SUCH DAMAGE.
 */

package com.ricebridge.csvman;


/** Extend this class to create your own {@link LineListener}.
 * <p>You can create custom data from CSV files by overriding and implementing the
 * methods in this class. The basic idea is that you implement the {@link #handleLineImpl handleLineImpl}
 * method, which is called each time a line of CSV data is read from the CSV file. You can then use this data
 * in whatever way you need.</p>
 * <p>Here's a very simple example. This <code>LineListener</code> just prints out any data that is loaded.
 * You might use this for debugging or logging.</p>
 * <pre><!--code-[-->
 * import com.ricebridge.csvman.*;
 *
 * public class PrintingLineListener extends CustomLineListener {
 *   protected BadLine handleLineImpl( String[] pLine, int pNumFields,
 *                                     long pLineNumber, String pOriginalLine )
 *   {
 *     // print only the fields actually found on this line
 *     for( int field = 0; field < pNumFields; field++ ) {
 *       System.out.print( pLine[field] + ", " );
 *     }
 *     System.out.println();
 *     return null; // null means the line was OK
 *   }
 * }
 * <!--code-]--></pre>
 * <p>You can see from this code that instead of implementing the public
 * {@link LineListener#handleLine LineListener.handleLine} method directly, you implement the protected
 * {@link #handleLineImpl handleLineImpl} method. This insulates you from
 * API changes and also provides an extra layer of error checking. And <code>CustomLineListener</code> provides default
 * implementations for most of the methods of <code>LineListener</code>,
 * so <code>handleLineImpl</code> is the only one you need to worry about.</p>
 * <p>The <code>handleLineImpl</code> method itself provides you with a lot of information about the
 * line of CSV data that was just loaded. You get a <code>String[]</code> array (<code>pLine</code>),
 * containing all the data. In order to reduce errors, this array is always the same size. For lines that
 * are missing data fields, <i>CSV Manager</i> will add in empty <code>String</code> elements at the end of the array.</p>
 * <p>Of course, you may want to know exactly how many fields there were in the line, so that's what
 * <code>pNumFields</code> tells you.
 * When some data fields are missing, <code>pNumFields</code> will then be less than <code>pLine.length</code>.</p>
 * <p><i>CSV Manager</i> also gives you the line number (<code>pLineNumber</code>),
 * and the text of the original line (<code>pOriginalLine</code>), as it appears in the CSV file.
 * This is so that you can produce nice error messages and logs.</p>
 * <p>Here's how you actually use this class:</p>
 * <pre><!--code-[-->
 * CsvManager csvman = new CsvManager();
 * PrintingLineListener pln = new PrintingLineListener();
 * csvman.load( "input-file.csv", pln );
 * <!--code-]--></pre>
 * <p>You just create a new instance of <code>PrintingLineListener</code>,
 * and then pass it to {@link CsvManager}.</p>
 * <p>When using custom <code>LineListeners</code>, the only thing that is
 * different from the normal <code>load</code> methods is that there is no return value.
 * All the data is passed to the custom <code>LineListener</code> instead.
 * You can still access this data by calling <code>get</code> methods on the custom <code>LineListener</code>.
 * Of course, you'll have to write those <code>get</code> methods yourself.</p>
 * <p>For more information about implementing your own <code>LineListeners</code>,
 * see the {@link LineListener} interface documentation. Also, take a look at the other methods you can implement:</p>
 * <ul><li>{@link #setCsvSpecImpl setCsvSpecImpl}</li>
 * <li>{@link #setLineSpecImpl setLineSpecImpl}</li>
 * <li>{@link #startProcessImpl startProcessImpl}</li>
 * <li>{@link #endProcessImpl endProcessImpl}</li>
 * <li>{@link #handleBadLineImpl handleBadLineImpl}</li></ul>
 * <p>Note: You can implement the {@link LineListener} interface directly if you need to. The danger is that your
 * implementation may not remain compatible with future versions of <i>CSV Manager</i>.
 * And you lose all the extra error-handling.
 * So it's better to stick with extending <code>CustomLineListener</code> if you can.</p>
 * <p>Finally, if you want to save data with custom processing, see {@link LineProvider}.</p>
 * <p>The <b><a href="CustomLineListener.java.html">Source Code</a></b> of this Java class
 * is available under a <a href="http://www.opensource.org/licenses/bsd-license.php">BSD-style license</a>.</p>
 *
 * @since 1.2.1
 * @see LineListener
 * @see LineListenerSupportImpl
 * @see BasicLineListener
 * @see CsvManager
 * @see CsvManagerException
 */
public abstract class CustomLineListener extends LineListenerSupportImpl {

  // protected methods

  /** Set the current {@link CsvSpec} used for loading CSV files.
   * <p>You can implement this method when you extend <code>CustomLineListener</code>,
   * but it is not required.</p>
   * <p>The <code>CsvSpec</code> controls the CSV loading and saving
   * process. It contains a number of settings such as the data field and line separators. You can also
   * set your own custom settings using the {@link CsvSpec#setProperty CsvSpec.setProperty} method.
   * You can then access these settings inside your own {@link LineListener} using the <code>CsvSpec</code> object
   * passed into this method.</p>
   * <p>This method is called before {@link #setLineSpecImpl setLineSpecImpl} is called.</p>
   * @param pCsvSpec {@link CsvSpec} object
   * @see CsvSpec
   * @see #setLineSpecImpl setLineSpecImpl
   */
  protected void setCsvSpecImpl( CsvSpec pCsvSpec ) throws Exception {
    // default does nothing
  }


  /** Set the current {@link LineSpec} used for interpreting CSV data fields.
   * <p>You can implement this method when you extend <code>CustomLineListener</code>,
   * but it is not required.</p>
   * <p>The <code>LineSpec</code> controls the conversion of individual CSV data fields into Java objects.
   * Whereas {@link CsvSpec} controls the entire process, <code>LineSpec</code> applies only to individual data fields.
   * In the current version of <i>CSV Manager</i> (1.2), <code>LineSpec</code> is used to load and save Java Beans,
   * by providing the <code>get</code> and <code>set</code> method names for each data field.
   * See {@link BeanLineListener} for more details.</p>
   * <p>You can subclass <code>LineSpec</code> to add your own
   * data field specific information for your own custom {@link LineListener LineListeners}.
   * You can then access these settings inside your own <code>LineListener</code>
   * using the <code>LineSpec</code> object passed into this method.</p>
   * <p>This method is called after {@link #setCsvSpecImpl setCsvSpecImpl} is called.</p>
   * @param pLineSpec {@link LineSpec} object
   * @see LineSpec
   * @see #setCsvSpecImpl setCsvSpecImpl
   * @see BeanLineListener
   */
  protected void setLineSpecImpl( LineSpec pLineSpec ) throws Exception {
    // default does nothing
  }


  /** Implement this method to receive notification that the loading of CSV data is about to start.
   * <p>You can implement this method when you extend <code>CustomLineListener</code>,
   * but it is not required.</p>
   * <p>You can use this method to initialise any resource you need to process the CSV data.
   * For example, you can open a database connection to store the data as it is loaded.</p>
   * <p>This method is called <i>after</i> {@link #setCsvSpecImpl setCsvSpecImpl} and
   * {@link #setLineSpecImpl setLineSpecImpl}.</p>
   * @see #endProcessImpl endProcessImpl
   */
  protected void startProcessImpl() throws Exception {
    // default does nothing
  }


  /** Implement this method to receive each data line as it is loaded.
   * <p>This method must be implemented when you extend <code>CustomLineListener</code>.</p>
   * <p>This method is where you will do the main work of processing the CSV data. As you get each line
   * in, you can decide what to do with the data.
   * The parameters of this method provide you with
   * a lot of information about the CSV data line that you can use in your application.</p>
   * <p>First, the <code>pLine</code> parameter contains the actual data as a
   * <code>String[]</code> array. This array is guaranteed not to contain any <code>null</code> Strings.
   * If empty data fields are found in the CSV line (for example <code>a,,b => ['a','','b']</code>)
   * then empty strings are placed in the array. This means that you can avoid nasty
   * {@link NullPointerException NullPointerExceptions}.</p>
   * <p>Equally nasty are {@link ArrayIndexOutOfBoundsException ArrayIndexOutOfBoundsExceptions}.
   * <i>CSV Manager</i> helps you avoid them by making sure that the <code>pLine</code> array is always
   * long enough. By "long enough", we mean either as long as the longest line found so far, or as long
   * as is specified by the {@link CsvSpec#setNumFields CsvSpec.setNumFields} method.</p>
   * <p>Of course, this means that in the case where there are fewer data fields than normal, you also
   * need to know exactly how many data fields there actually were,
   * as <code>pLine.length</code> will not tell you this. This is what the <code>pNumFields</code>
   * parameter is for. So if you need to check exactly how many data fields a line had, use
   * <code>pNumFields</code>.</p>
   * <p>To help with error reporting, <i>CSV Manager</i> also provides the line number of the
   * current data line, passed in via <code>pLineNumber</code>. This includes any bad lines found.
   * <code>pLineNumber</code> is a <code>long</code>, just in case you ever have a really,
   * really big CSV file.</p>
   * <p>Finally, you also get the text of the original line (<code>pOriginalLine</code>), so you can create
   * user-friendly error messages for your users. And it makes debugging easier.</p>
   * <p><b>Error Handling:</b> What happens when the data in the CSV file is incorrect in some way?
   * For example, it might not be valid for your database.
   * In this case, even though the <i>syntax</i>
   * of the CSV is correct, there is a <i>semantic</i> error. To capture this case, we use the following
   * contract: if all is well, return <code>null</code> from <code>handleLineImpl</code>.
   * If there is an error with the data, return a {@link BadLine} object describing the error.</p>
   * <p>This provides consistent handling of errors. At any time you can of course
   * just throw an {@link Exception}, but if you use a <code>BadLine</code> instead then you get
   * proper summary statistics, nice error reporting, and faster performance as you avoid the overhead
   * of <code>Exception</code> throwing and catching.</p>
   * <p>But if you have an error that is not data related (for example, if your database goes down), then it is
   * better to throw an <code>Exception</code>.</p>
   * @param pLine         String values of data fields in line
   * @param pNumFields    Number of data fields actually found on the current line
   * @param pLineNumber   Count of lines processed so far.
   * @param pOriginalLine Text of original data line from data source
   * @return <code>null</code> if line is OK, {@link BadLine} object if line was bad in some way
   * @see LineListener#handleLine LineListener.handleLine
   * @see BadLine
   * @see #handleBadLineImpl handleBadLineImpl
   */
  protected abstract BadLine handleLineImpl( String[] pLine, int pNumFields, long pLineNumber, String pOriginalLine )
    throws Exception;


  /** Implement this method to be notified when badly formatted data is encountered.
   * <p>You can implement this method when you extend <code>CustomLineListener</code>,
   * but it is not required.</p>
   * <p>When a syntax error is encountered in the CSV file you are loading, a {@link BadLine} object is
   * created by <i>CSV Manager</i> to describe the problem, and then it is passed to your custom
   * {@link LineListener} for further handling.
   * You can decide how to log the error or what other actions
   * to take based on your error handling policy.</p>
   * <p>What happens after <code>handleBadLineImpl</code> is called? It depends on the
   * {@link CsvSpec#setIgnoreBadLines CsvSpec.setIgnoreBadLines} setting. If this setting is
   * <code>true</code>, then loading will continue with the rest of the CSV file. If it is
   * <code>false</code>, then loading will halt and a {@link CsvManagerException} will be thrown for
   * the code that called the {@link CsvManager#load(Object,LineListener) CsvManager.load} method to catch.</p>
   * <p>Note: <code>BadLines</code> returned by {@link #handleLineImpl handleLineImpl} are <i>not</i>
   * passed to this method. Since you already know about them in <code>handleLineImpl</code>, there would not
   * be much point.</p>
   * @param pBadLine {@link BadLine} object describing the error
   * @see LineListener#handleBadLine LineListener.handleBadLine
   * @see #handleLineImpl handleLineImpl
   */
  protected void handleBadLineImpl( BadLine pBadLine ) throws Exception {
    // do nothing - pBadLine recorded by caller
  }


  /** Implement this method to receive notification that the loading of CSV data has ended.
   * <p>You can implement this method when you extend <code>CustomLineListener</code>,
   * but it is not required.</p>
   * <p>You can use this method to close any open resources that were used to handle the
   * CSV data. For example, you can close any open database connections.</p>
   * <p>This method is called <i>last</i>, after all {@link #handleLineImpl handleLineImpl}
   * and {@link #handleBadLineImpl handleBadLineImpl} calls have been made.</p>
   * @see #startProcessImpl startProcessImpl
   */
  protected void endProcessImpl() throws Exception {
    // default does nothing
  }

}
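
/*
Illustration only, not part of the CSV Manager API. The handleLineImpl documentation above says
that pLine never contains null, is padded with empty strings so that it is always "long enough",
and that pNumFields carries the number of fields actually found. The sketch below mimics that
observable contract in plain Java so it can be tried without the library. The class
PaddedLineSketch and its methods pad and describe are hypothetical names invented for this
example; how CSV Manager pads lines internally is not shown in this file.

```java
import java.util.Arrays;

// Hypothetical stand-in: mimics the documented pLine / pNumFields contract.
class PaddedLineSketch {

  // Returns a copy of pFields that is at least pMinLength long.
  // Missing trailing fields become empty strings, never null.
  static String[] pad( String[] pFields, int pMinLength ) {
    int length = Math.max( pFields.length, pMinLength );
    String[] padded = Arrays.copyOf( pFields, length ); // new slots start as null
    for( int i = pFields.length; i < length; i++ ) {
      padded[i] = "";
    }
    return padded;
  }

  // Reports how many fields were really present versus the padded length.
  static String describe( String[] pLine, int pNumFields ) {
    return pNumFields + " of " + pLine.length + " fields present";
  }

  public static void main( String[] args ) {
    String[] fields = { "a", "", "b" };   // fields parsed from the line a,,b
    String[] line = pad( fields, 5 );     // 5 plays the role of a configured field count
    System.out.println( describe( line, fields.length ) );
  }
}
```

With these inputs, main prints "3 of 5 fields present": every index below line.length is safe to
read, while the second argument to describe (the counterpart of pNumFields) still tells you
that only three fields were on the line.
*/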