The project required doing LZO compression outside of a Hadoop cluster. It took several days of fiddling to get working, so these notes are a summary for future reference.
[Unlike gzip, LZO is not something you can use just by dropping in a jar: it calls down into a native C/C++ library, so the usual approach is to install the LZO tools on Linux first and build the native lib before anything can be compressed. The build needs the Ant tool, so if you don't have Ant, install it first. I mainly followed this post: http://www.cnblogs.com/chaoboma/archive/2013/04/27/3047625.html]
1. Install the Ant build tool:
tar -zxvf apache-ant-1.8.2-bin.tar.gz to unpack it,
then move apache-ant-1.8.2 to /usr/local/ (the usual location).
Add Ant's environment variables: vi /etc/profile
export ANT_HOME=/usr/local/apache-ant-1.8.2
export PATH=$PATH:$ANT_HOME/bin
source /etc/profile to make the settings take effect.
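Put together, step 1 can be scripted like this (a sketch: it assumes the tarball is in the current directory and you are root; adjust the version for your environment):

```shell
# Sketch of the Ant setup above; assumes the tarball is in the current
# directory and you have root. Adjust the version as needed.
tar -zxvf apache-ant-1.8.2-bin.tar.gz
mv apache-ant-1.8.2 /usr/local/
cat >> /etc/profile <<'EOF'
export ANT_HOME=/usr/local/apache-ant-1.8.2
export PATH=$PATH:$ANT_HOME/bin
EOF
source /etc/profile
ant -version   # prints the Ant version if the setup worked
```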
2. Install the LZO library: http://www.oberhumer.com/opensource/lzo/download/lzo-2.04.tar.gz
tar -zxf lzo-2.04.tar.gz
cd lzo-2.04 into the unpacked directory, then run:
./configure --enable-shared
make && make install
1) Copy the LZO library files from /usr/local/lib to /usr/lib (32-bit platforms) or /usr/lib64 (64-bit platforms);
2) create a file named lzo.conf under /etc/ld.so.conf.d/ containing the library path /usr/local/lib,
then run /sbin/ldconfig -v to make the configuration take effect.
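The registration steps 1) and 2) above as concrete commands (a sketch; run as root, and pick the copy destination that matches your platform):

```shell
# 1) copy the libraries produced by `make install` next to the system ones
cp /usr/local/lib/liblzo2.* /usr/lib/      # 32-bit platforms
# cp /usr/local/lib/liblzo2.* /usr/lib64/  # 64-bit platforms
# 2) register /usr/local/lib with the dynamic linker
echo "/usr/local/lib" > /etc/ld.so.conf.d/lzo.conf
/sbin/ldconfig -v
```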
3. Install the LZO encoder/decoder (mind the 32-bit vs. 64-bit difference):
http://pkgs.repoforge.org/lzo/
Download lzo-devel-2.04-1.el5.rf.i386.rpm and lzo-2.04-1.el5.rf.i386.rpm.
Both are needed because lzo-devel depends on lzo-2.04-1.el5.rf, so install the base package first:
rpm -ivh lzo-2.04-1.el5.rf.i386.rpm
rpm -ivh lzo-devel-2.04-1.el5.rf.i386.rpm
If this errors out, try running it again.
4. Build the hadoop-lzo jar.
Download the latest source from https://github.com/kevinweil/hadoop-lzo/downloads (currently hadoop-lzo-master.zip).
unzip hadoop-lzo-master.zip
cd hadoop-lzo-master
On a 32-bit server: export CFLAGS=-m32; export CXXFLAGS=-m32
On a 64-bit server: export CFLAGS=-m64; export CXXFLAGS=-m64
ant compile-native tar; when it finishes you can see hadoop-lzo*.jar sitting in the build directory.
(Before building, check that build.xml exists; the first archive I downloaded was missing it and I had to re-download from somewhere else. If the Linux box cannot reach the internet, edit build.xml and ivysettings.xml and point the Maven-related URLs at an internal mirror. Also note that if the test Java project still references hadoop-gpl-compression-0.1.0.jar, it throws a native-lib initialization error; delete that jar from the project and use the hadoop-lzo-0.4.15.jar produced by this Ant build instead.)
A successful build creates a build directory; build/native/Linux-amd64-64/lib holds the native lib, which must be on LD_LIBRARY_PATH. Editing /etc/profile is enough; add:
LZO_LIBRARY_PATH=/root/gaopeng/soft/hadoop-lzo-master/build/native/Linux-amd64-64/lib
export LD_LIBRARY_PATH=$LZO_LIBRARY_PATH
Alternatively, copy the native files into /usr/lib or /usr/lib64 and point LD_LIBRARY_PATH there; that works just as well.
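All of step 4 as one script (a sketch for a 64-bit box; the checkout location is whatever directory you unzipped into):

```shell
unzip hadoop-lzo-master.zip
cd hadoop-lzo-master
export CFLAGS=-m64 CXXFLAGS=-m64   # use -m32 on a 32-bit server
ant compile-native tar
# the jar is now in build/, the native lib in build/native/Linux-amd64-64/lib
LZO_LIBRARY_PATH=$(pwd)/build/native/Linux-amd64-64/lib
export LD_LIBRARY_PATH=$LZO_LIBRARY_PATH
```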
5. Now, following this post http://www.cnblogs.com/xuxm2007/archive/2012/06/15/2550996.html,
you can test LZO compression without a Hadoop cluster:
package com.jd.gp.lzotest;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionInputStream;
import org.apache.hadoop.io.compress.CompressionOutputStream;
import org.apache.hadoop.util.ReflectionUtils;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 * Compresses a file with LzoCodec and decompresses it again,
 * without a running Hadoop cluster.
 */
public class LzoStart {

    private static final Logger logger = LoggerFactory.getLogger(LzoStart.class);

    public static void main(String[] args) throws IOException {
        logger.info("start...");
        Configuration conf = new Configuration();
        CompressionOutputStream out = null;
        java.io.InputStream in = null;
        CompressionInputStream ins = null;
        try {
            Class<?> codecClass = Class.forName("com.hadoop.compression.lzo.LzoCodec");
            CompressionCodec codec = (CompressionCodec) ReflectionUtils.newInstance(codecClass, conf);
            String outfile = "/root/gaopeng/lzo/test.txt.lzo";
            out = codec.createOutputStream(new java.io.FileOutputStream(outfile));
            byte[] buffer = new byte[100];
            logger.info("outfile=" + outfile + ", codec.getDefaultExtension()=" + codec.getDefaultExtension());
            String inFilename = removeSuffix(outfile, ".lzo");
            logger.info("inFilename=" + inFilename);
            in = new java.io.FileInputStream(inFilename);
            long start = System.currentTimeMillis();
            int len = in.read(buffer);
            while (len > 0) {
                // Only convert the bytes actually read; new String(buffer)
                // would also include whatever is left in the rest of the array.
                logger.info("out=" + new String(buffer, 0, len));
                out.write(buffer, 0, len);
                len = in.read(buffer);
            }
            // Close the compressed stream before reading the file back,
            // otherwise buffered data has not been flushed to disk yet.
            out.close();
            out = null;
            logger.info("compression finished! time = " + (System.currentTimeMillis() - start) + " ms");
            ins = codec.createInputStream(new java.io.FileInputStream(outfile));
            len = ins.read(buffer);
            while (len > 0) {
                logger.info("in=" + new String(buffer, 0, len));
                len = ins.read(buffer);
            }
            logger.info("decompression finished! all time = " + (System.currentTimeMillis() - start) + " ms");
        } catch (Exception e) {
            logger.error(e.getLocalizedMessage(), e);
        } finally {
            if (out != null) {
                out.close();
                out = null;
            }
            if (in != null) {
                in.close();
                in = null;
            }
            if (ins != null) {
                ins.close();
                ins = null;
            }
        }
    }

    /**
     * Removes a suffix from a filename, if it has it.
     * @param filename the filename to strip
     * @param suffix the suffix to remove
     * @return the shortened filename
     */
    public static String removeSuffix(String filename, String suffix) {
        if (filename.endsWith(suffix)) {
            return filename.substring(0, filename.length() - suffix.length());
        }
        return filename;
    }
}
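For completeness, a hypothetical way to compile and run the class above. The jar names here are assumptions (use whatever Hadoop and slf4j jars your project actually depends on, plus the hadoop-lzo jar built in step 4), and LD_LIBRARY_PATH must point at the native lib:

```shell
# Jar names below are placeholders; substitute your project's dependencies.
export LD_LIBRARY_PATH=/root/gaopeng/soft/hadoop-lzo-master/build/native/Linux-amd64-64/lib
CP=hadoop-lzo-0.4.15.jar:hadoop-core.jar:slf4j-api.jar:slf4j-simple.jar
javac -cp "$CP" -d classes src/com/jd/gp/lzotest/LzoStart.java
java -cp "classes:$CP" com.jd.gp.lzotest.LzoStart
```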