
[java] Parsing CSV data

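When a web front end sends an array to the back end, the values often contain special characters such as , " and '. Sending the array in CSV format handles these well; the only drawback is that parsing it on the server side takes more work.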

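CSV parsing:
The CSV rules are a little fiddly. A field falls into one of two modes, CSV mode or non-CSV mode.
CSV mode applies to a string that contains a comma or a double quotation mark. When Excel exports such a string, it adds " at the start and at the end; if the string itself contains ", another " is placed in front of it, giving "".

NON-CSV mode is the normal case: the string is written as it is.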
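The quoting rule above can be sketched from the writing side. This is only a minimal illustration, not code from the original post: the class name `CsvFieldEscaper` and the method `escape` are hypothetical, and the sketch covers only the comma and double-quote cases described here (line breaks inside a field are not considered).

```java
// Hypothetical helper for illustration; the names are assumptions, not the author's code.
class CsvFieldEscaper {

    /** Returns the field as it would appear in a CSV line under the rules above. */
    static String escape(String field) {
        // NON-CSV mode: no comma and no double quote, so the field is written as-is
        boolean needsQuotes = field.indexOf(',') >= 0 || field.indexOf('"') >= 0;
        if (!needsQuotes) {
            return field;
        }
        // CSV mode: wrap in double quotes and double every embedded double quote
        StringBuilder sb = new StringBuilder("\"");
        for (int i = 0; i < field.length(); i++) {
            char c = field.charAt(i);
            if (c == '"') {
                sb.append('"');
            }
            sb.append(c);
        }
        return sb.append('"').toString();
    }
}
```

Under these assumptions, `escape("ab,c")` gives `"ab,c"` and `escape("a\"b\"c")` gives `"a""b""c"`, which matches what the Excel export in the unit test further down produces.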

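In addition, spaces between two strings are not significant, i.e. they are trimmed. (That is how it should work, but Excel 2003 has a bug here, so it is best to avoid such input.)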

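Algorithm: rather than using a regexp, it is more convenient to scan the string directly in Java. This is the kind of exercise that used to come up often in C exams.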
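For comparison, the character-by-character idea can also be written as a single loop with an in-quotes flag. The sketch below is an illustration under stated assumptions, not the author's implementation (which follows): `CsvLineScanner` and `split` are hypothetical names, and unlike the rules above this version does not skip spaces between fields or report malformed input.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical single-pass scanner for illustration; not the author's StringUtil.
class CsvLineScanner {

    /** Splits one CSV line into fields, honoring "..." quoting and "" escapes. */
    static List<String> split(String line) {
        List<String> fields = new ArrayList<String>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c != '"') {
                    current.append(c);          // ordinary character inside quotes
                } else if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    current.append('"');        // "" stands for one literal "
                    i++;
                } else {
                    inQuotes = false;           // closing quotation mark
                }
            } else if (c == '"') {
                inQuotes = true;                // opening quotation mark
            } else if (c == ',') {
                fields.add(current.toString()); // the comma ends the current field
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());         // last field, empty after a trailing comma
        return fields;
    }
}
```

With this sketch, `CsvLineScanner.split("abc,\"ab,c\",")` yields three fields: `abc`, `ab,c` and an empty string.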



package com.ys.util;

import java.util.ArrayList;
import java.util.List;

public class StringUtil{

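	/**
	 * split one csv format line into an array of words
	 * @param param the csv line, sample: abc,"ab,c","a""b"
	 * @return the words in order, with the surrounding quotation marks removed and "" turned back into "
	 * @throws Exception CSV Format error
	 */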
	public static String[] buildCsvArray(String param) throws Exception{
		List<String> list=new ArrayList<String>();
		int len=param.length();
		char comma=',';
		char quot='"';
		boolean csvMode=false;		//start and end with double quotation

		//abc, ,, "n/a, n/a", "", "abc"
		int i=0;
		while(i<len){
			char key=param.charAt(i);
			if(len>0 && quot==key){
				csvMode=true;
				i++;
			}else{
				csvMode=false;
			}
			if(i<len){
				if(csvMode)
					i=processingCSV(list, param, i);
				else
					i=processingWord(list, param, i);
			}
		}
		//add text if last char is comma, example: "a,b," -> "a", "b", ""
		//the len>0 check avoids an exception when the input is an empty string
		if(len>0 && comma==param.charAt(len-1))
			list.add("");
		String[] text=list.toArray(new String[0]);
		return text;
		
	}
	
	/**
	 * skip the spaces between one word and the next, and return the next word's start index 
	 * @param param csv string "abc,    def"
	 * @param i the index just after the previous word and its comma, sample: 4
	 * @return  the next word's start index, sample: 8 
	 */
	private static int processClose(String param, int i){
		int len=param.length();
		//check the bound on every step, so trailing spaces ("abc,   ") cannot run past the end
		while(i<len && ' '==param.charAt(i)){
			i++;
		}
		return i;
	}
	
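	/**
	 * generate word in csv mode, and add into list
	 * @param list collects the parsed words
	 * @param param csv string
	 * @param i word start index (just after the opening quotation mark)
	 * @return start index of the next word
	 */
	private static int processingCSV(List<String> list, String param, int i){
		StringBuffer sb=new StringBuffer("");
		int len=param.length();
		//only the closing quotation mark is left: the word is empty
		if(i+1>=len){
			list.add(sb.toString());
			return i;
		}
		char c=param.charAt(i);
		char cc=param.charAt(i+1);
		char comma=',';
		char quot='"';
		//read until the closing quotation mark, i.e. a " followed by a comma
		while(c!=quot || cc!=comma){
			//"" inside the word stands for one literal ", so skip the first of the pair
			if(c==quot && cc==quot)
				i++;
			sb.append(c);
			c=param.charAt(++i);
			//last character reached: it is the closing quotation mark
			if(i+1>=len)
				break;
			cc=param.charAt(i+1);
		}
		list.add(sb.toString());
		//i is at the closing quotation mark; step over it and the comma
		i=processClose(param, i+2);
		return i;
	}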
	
	/**
	 * generate word in non-csv mode, and add into list
	 * @param list collects the parsed words
	 * @param param csv string
	 * @param i start index of this word
	 * @return start index of the next word
	 */
	private static int processingWord(List<String> list, String param, int i){
		StringBuffer sb=new StringBuffer("");
		int len=param.length();
		if(i>=len){
			list.add(sb.toString());
			return i;
		}
		char comma=',';
		for(;i<len;i++){
			char c=param.charAt(i);
			if(c!=comma)
				sb.append(c);
			else
				break;
		}
		list.add(sb.toString());
		i=processClose(param, ++i);
		return i;
	}

}


Unit test

package com.ys.util;

import junit.framework.TestCase;

public class StringUtilTest extends TestCase {

	public void testToStringStringArray() throws Exception {
		/**
		 *  enter these strings into excel, one per cell
		 *  abc
		 *  ab,c
		 *  a"b"c
		 *  a"b,c",d
		 *  abc
		 *  save as csv file
		 */
		String param="abc,\"ab,c\",\"a\"\"b\"\"c\",\"a\"\"b,c\"\",d\",abc";
		String[] result = StringUtil.buildCsvArray(param);
		for(String text : result){
			System.out.println(text);
		}
		//check every parsed word, not only that a result came back
		assertEquals(5, result.length);
		assertEquals("abc", result[0]);
		assertEquals("ab,c", result[1]);
		assertEquals("a\"b\"c", result[2]);
		assertEquals("a\"b,c\",d", result[3]);
		assertEquals("abc", result[4]);
	}

}