A Sqoop script that had been extracting data from DB2 without problems suddenly started failing with the following error:
16/11/09 15:37:53 INFO mapreduce.Job: Task Id : attempt_1472462720075_14348_m_000004_0, Status : FAILED
Error: java.io.IOException: SQLException in nextKeyValue
	at org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:266)
	at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:553)
	at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
	at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
	at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: com.ibm.db2.jcc.am.SqlException: [jcc][t4][1065][12306][4.16.53] Caught java.io.CharConversionException. See attached Throwable for details. ERRORCODE=-4220, SQLSTATE=null
	at com.ibm.db2.jcc.am.fd.a(fd.java:723)
	at com.ibm.db2.jcc.am.fd.a(fd.java:60)
	at com.ibm.db2.jcc.am.fd.a(fd.java:112)
	at com.ibm.db2.jcc.am.jc.a(jc.java:2870)
	at com.ibm.db2.jcc.am.jc.p(jc.java:527)
	at com.ibm.db2.jcc.am.jc.N(jc.java:1563)
	at com.ibm.db2.jcc.am.ResultSet.getStringX(ResultSet.java:1153)
	at com.ibm.db2.jcc.am.ResultSet.getString(ResultSet.java:1128)
	at org.apache.sqoop.lib.JdbcWritableBridge.readString(JdbcWritableBridge.java:71)
	at com.cloudera.sqoop.lib.JdbcWritableBridge.readString(JdbcWritableBridge.java:61)
	at QueryResult.readFields(QueryResult.java:210)
	at org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:246)
	... 12 more
Caused by: java.nio.charset.MalformedInputException: Input length = 67277
	at com.ibm.db2.jcc.am.r.a(r.java:19)
	at com.ibm.db2.jcc.am.jc.a(jc.java:2862)
	... 20 more
Caused by: sun.io.MalformedInputException
	at sun.io.ByteToCharUTF8.convert(ByteToCharUTF8.java:105)
	at com.ibm.db2.jcc.am.r.a(r.java:16)
	... 21 more

The likely cause, as described by IBM, is:
When an application uses the IBM Data Server Driver for JDBC and SQLJ (also known as the JCC driver) and is connected to a database with code set UTF-8 (code page 1208), it throws an SqlException with message including "Caught java.io.CharConversionException" and ERRORCODE=-4220 if the data in a character column that it queries contains a sequence of bytes that is not a valid UTF-8 string.

The fix is as follows:
Download the latest JDBC driver package for your DB2 version:
http://www-01.ibm.com/support/docview.wss?uid=swg21363866
In the program itself, you can add:
System.setProperty("db2.jcc.charsetDecoderEncoder", "3");
or pass the same property on the JVM command line:
java -Ddb2.jcc.charsetDecoderEncoder=3 MyApp
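To see what this setting changes, here is a minimal sketch using only the standard java.nio decoders. It is not the driver's own code, and the class name Utf8DecodeDemo is made up for illustration. As far as I understand IBM's description, the value 3 makes the driver substitute the Unicode replacement character (U+FFFD) for an invalid byte sequence instead of throwing, which roughly corresponds to switching a decoder from REPORT to REPLACE:

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it does not use or depend on the DB2 driver.
final class Utf8DecodeDemo {

    // Builds a UTF-8 decoder that applies the given action to invalid input.
    private static CharsetDecoder decoder(CodingErrorAction action) {
        return StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(action)
                .onUnmappableCharacter(action);
    }

    // Strict check: true only if the bytes form a valid UTF-8 string.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            decoder(CodingErrorAction.REPORT).decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            // MalformedInputException is a subclass and lands here.
            return false;
        }
    }

    // Lenient decode: invalid sequences become U+FFFD instead of raising an error.
    static String decodeReplacing(byte[] bytes) {
        try {
            return decoder(CodingErrorAction.REPLACE)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException e) {
            // Not expected with REPLACE; rethrown unchecked to keep the signature simple.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // 0xFF can never appear in valid UTF-8.
        byte[] bad = {'a', 'b', (byte) 0xFF, 'c'};
        System.out.println("valid UTF-8: " + isValidUtf8(bad));
        System.out.println("lenient decode: " + decodeReplacing(bad));
    }
}
```

The trade-off is that the bad bytes are silently replaced, so the job finishes but the affected values will contain U+FFFD. A strict check like isValidUtf8 may help to find the offending rows in the source table afterwards.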
Because the current job runs on MapReduce, it is enough to make the following change in mapred-site.xml:
<property>
  <name>mapreduce.map.java.opts</name>
  <value>-Xmx1024m -Ddb2.jcc.charsetDecoderEncoder=3</value>
</property>

The cluster does not need to be restarted after this change.