datax-kudu-plugins

A Kudu writer plugin for DataX.

Example:

```json
{
  "name": "kuduwriter",
  "parameter": {
    "kuduConfig": {
      "kudu.master_addresses": "***",
      "timeout": 60000,
      "sessionTimeout": 60000
    },
    "table": "",
    "replicaCount": 3,
    "truncate": false,
    "writeMode": "upsert",
    "partition": {
      "range": {
        "column1": [
          {
            "lower": "2020-08-25",
            "upper": "2020-08-26"
          },
          {
            "lower": "2020-08-26",
            "upper": "2020-08-27"
          },
          {
            "lower": "2020-08-27",
            "upper": "2020-08-28"
          }
        ]
      },
      "hash": {
        "column": [
          "column1"
        ],
        "number": 3
      }
    },
    "column": [
      {
        "index": 0,
        "name": "c1",
        "type": "string",
        "primaryKey": true
      },
      {
        "index": 1,
        "name": "c2",
        "type": "string",
        "compress": "DEFAULT_COMPRESSION",
        "encoding": "AUTO_ENCODING",
        "comment": "comment xxxx"
      }
    ],
    "batchSize": 1024,
    "bufferSize": 2048,
    "skipFail": false,
    "encoding": "UTF-8"
  }
}
```
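The `range` entries in the example are consecutive one-day intervals, so they can be generated rather than written by hand. The following is a minimal sketch, not part of the plugin: the helper name `daily_ranges` is hypothetical, and it assumes the range column holds ISO-formatted date strings as in the example.

```python
import json
from datetime import date, timedelta
from typing import Dict, List


def daily_ranges(start: date, days: int) -> List[Dict[str, str]]:
    """Build consecutive one-day lower/upper bounds as ISO date strings."""
    ranges = []
    for offset in range(days):
        lower = start + timedelta(days=offset)
        upper = lower + timedelta(days=1)
        ranges.append({"lower": lower.isoformat(), "upper": upper.isoformat()})
    return ranges


# "column1" is the range column name used in the example configuration.
partition = {"range": {"column1": daily_ranges(date(2020, 8, 25), 3)}}
print(json.dumps(partition, indent=2))
```

The printed fragment can be pasted into the `partition` section of the writer configuration.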

Required parameters:

```json
"writer": {
  "name": "kuduwriter",
  "parameter": {
    "kuduConfig": {
      "kudu.master_addresses": "***"
    },
    "table": "***",
    "column": [
      {
        "name": "c1",
        "type": "string",
        "primaryKey": true
      },
      {
        "name": "c2",
        "type": "string"
      },
      {
        "name": "c3",
        "type": "string"
      },
      {
        "name": "c4",
        "type": "string"
      }
    ]
  }
}
```

List the primary key columns first.
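A job file can be checked against these rules before it is submitted. The sketch below is illustrative only and is not how the plugin itself validates its input: `check_writer_parameter` is a hypothetical helper, the set of required keys is taken from the required-parameters example above, and `host:7051` and `demo_table` are placeholder values.

```python
from typing import Any, Dict, List


def check_writer_parameter(parameter: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means all checks passed."""
    problems = []
    if not parameter.get("kuduConfig", {}).get("kudu.master_addresses"):
        problems.append("kuduConfig.kudu.master_addresses is missing")
    if not parameter.get("table"):
        problems.append("table is missing")
    columns = parameter.get("column") or []
    if not columns:
        problems.append("column is missing or empty")

    # Primary key columns must form a prefix of the column list.
    seen_non_key = False
    for col in columns:
        if col.get("primaryKey", False):
            if seen_non_key:
                problems.append(
                    f"primary key column {col.get('name')!r} "
                    "must come before non-key columns"
                )
        else:
            seen_non_key = True
    return problems


# Placeholder configuration: the key column c2 is wrongly listed after c1.
parameter = {
    "kuduConfig": {"kudu.master_addresses": "host:7051"},
    "table": "demo_table",
    "column": [
        {"name": "c1", "type": "string"},
        {"name": "c2", "type": "string", "primaryKey": True},
    ],
}
print(check_writer_parameter(parameter))
```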

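Configuration list

| name | default | description | required |
| --- | --- | --- | --- |
| kuduConfig | | Kudu configuration (`kudu.master_addresses`, etc.) | |
| table | | Name of the target table | |
| partition | | Partitioning | |
| column | | | |
| name | | Column name | |
| type | string | Column type. Currently supported: INT, FLOAT, STRING, BIGINT, DOUBLE, BOOLEAN, LONG. | |
| index | ascending order | Column index position (specify it for every column or for none). For example, if the reader yields fields in the order (name, id, age) but the Kudu target table is laid out as (id, name, age), set index to (1, 0, 2). The default order is (0, 1, 2). | |
| primaryKey | false | Whether the column is a primary key (list all primary key columns first). If no primary key is declared, dirty data is not checked or filtered. | |
| compress | DEFAULT_COMPRESSION | Compression format | |
| encoding | AUTO_ENCODING | Encoding | |
| replicaCount | 3 | Number of replicas to keep | |
| hash | | Hash partitioning | |
| number | 3 | Number of hash partitions | |
| range | | Range partitioning | |
| lower | | Lower bound of a range partition (e.g. a table created in SQL with `partition value='haha'` corresponds to `"lower": "haha", "upper": "haha\000"`) | |
| upper | | Upper bound of a range partition (e.g. a table created in SQL with `partition "10" <= VALUES < "20"` corresponds to `"lower": "10", "upper": "20"`) | |
| truncate | false | Whether to clear the table; this is implemented by dropping and recreating the table | |
| writeMode | upsert | upsert, insert, update | |
| batchSize | 512 | Flush after this many rows (preferably no more than 1024) | |
| bufferSize | 3072 | Buffer size | |
| skipFail | false | Whether to skip rows that fail to insert | |
| timeout | 60000 | Client timeout, e.g. for create-table and drop-table operations. Unit: ms | |
| sessionTimeout | 60000 | Session timeout. Unit: ms | |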
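The `index` remapping described above can be illustrated with a small sketch. It assumes that `index` is the position of the field in the reader record and that columns are listed in the Kudu table's order; this reading matches the (1, 0, 2) example but is an interpretation, not something the description states outright. `map_record` and the sample values are hypothetical.

```python
from typing import Any, Dict, List, Sequence


def map_record(record: Sequence[Any], columns: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Pair each configured column name with the reader field at its index."""
    row = {}
    for position, col in enumerate(columns):
        # Without an explicit index, fall back to the column's own position (0, 1, 2, ...).
        source = col.get("index", position)
        row[col["name"]] = record[source]
    return row


reader_record = ["alice", 7, 30]  # reader order: name, id, age
columns = [  # Kudu table order: id, name, age
    {"name": "id", "index": 1},
    {"name": "name", "index": 0},
    {"name": "age", "index": 2},
]
print(map_record(reader_record, columns))
# → {'id': 7, 'name': 'alice', 'age': 30}
```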