Help-Desk / HELP-15764

[fiware-stackoverflow] How to match column between HDFS File(save through cygnus) to Hive table?

    Details

      Description

      Created question in FIWARE Q/A platform on 02-05-2019 at 10:05
      Please, ANSWER this question AT https://stackoverflow.com/questions/55948021/how-to-match-column-between-hdfs-filesave-through-cygnus-to-hive-table

      Question:
      How to match column between HDFS File(save through cygnus) to Hive table?

      Description:
      I want to save some data into Hive through fiware-cygnus.

      I set up the following agent configuration file:

      cygnus-ngsi.sources = http-source
      cygnus-ngsi.sinks = hdfs-sink
      cygnus-ngsi.channels = hdfs-channel

      cygnus-ngsi.sources.http-source.channels = hdfs-channel
      cygnus-ngsi.sources.http-source.type = org.apache.flume.source.http.HTTPSource
      cygnus-ngsi.sources.http-source.port = 5050
      cygnus-ngsi.sources.http-source.handler = com.telefonica.iot.cygnus.handlers.NGSIRestHandler
      cygnus-ngsi.sources.http-source.handler.notification_target = /notify
      cygnus-ngsi.sources.http-source.handler.default_service = def_serv
      cygnus-ngsi.sources.http-source.handler.default_service_path = /def_servpath
      cygnus-ngsi.sources.http-source.handler.events_ttl = 2
      cygnus-ngsi.sources.http-source.interceptors = ts gi
      cygnus-ngsi.sources.http-source.interceptors.ts.type = timestamp
      cygnus-ngsi.sources.http-source.interceptors.gi.type = com.telefonica.iot.cygnus.interceptors.NGSIGroupingInterceptor$Builder
      cygnus-ngsi.sources.http-source.interceptors.gi.grouping_rules_conf_file = /Applications/apache-flume-1.4.0-bin/conf/grouping_rules.confi

      cygnus-ngsi.channels.hdfs-channel.type = memory
      cygnus-ngsi.channels.hdfs-channel.capacity = 1000
      cygnus-ngsi.channels.hdfs-channel.transactionCapacity = 100

      cygnus-ngsi.sinks.hdfs-sink.type = com.telefonica.iot.cygnus.sinks.NGSIHDFSSink
      cygnus-ngsi.sinks.hdfs-sink.channel = hdfs-channel
      cygnus-ngsi.sinks.hdfs-sink.enable_encoding = true
      cygnus-ngsi.sinks.hdfs-sink.enable_grouping = false
      cygnus-ngsi.sinks.hdfs-sink.enable_lowercase = false
      cygnus-ngsi.sinks.hdfs-sink.enable_name_mappings = false
      cygnus-ngsi.sinks.hdfs-sink.data_model = dm-by-entity
      cygnus-ngsi.sinks.hdfs-sink.file_format = json-row
      cygnus-ngsi.sinks.hdfs-sink.backend.impl = rest
      cygnus-ngsi.sinks.hdfs-sink.backend.max_conns = 500
      cygnus-ngsi.sinks.hdfs-sink.backend.max_conns_per_route = 100
      cygnus-ngsi.sinks.hdfs-sink.hdfs_host = [hdfs_host]
      cygnus-ngsi.sinks.hdfs-sink.hdfs_port = 50070
      cygnus-ngsi.sinks.hdfs-sink.hdfs_username = [username]
      cygnus-ngsi.sinks.hdfs-sink.hdfs_password = [password]
      cygnus-ngsi.sinks.hdfs-sink.oauth2_token = [token]
      cygnus-ngsi.sinks.hdfs-sink.service_as_namespace = false
      cygnus-ngsi.sinks.hdfs-sink.batch_size = 100
      cygnus-ngsi.sinks.hdfs-sink.batch_timeout = 30
      cygnus-ngsi.sinks.hdfs-sink.batch_ttl = 10
      cygnus-ngsi.sinks.hdfs-sink.batch_retry_intervals = 5000
      cygnus-ngsi.sinks.hdfs-sink.hive = true
      cygnus-ngsi.sinks.hdfs-sink.hive.server_version = 2
      cygnus-ngsi.sinks.hdfs-sink.hive.host = [hive_host]
      cygnus-ngsi.sinks.hdfs-sink.hive.port = 10000
      cygnus-ngsi.sinks.hdfs-sink.hive.db_type = default-db
      cygnus-ngsi.sinks.hdfs-sink.krb5_auth = false
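      (For context: with this configuration, Cygnus listens for NGSI notifications on port 5050 at /notify. Below is a minimal sketch of posting one such notification, assuming an NGSIv1 notifyContextRequest payload, which is what NGSIRestHandler consumes; the subscription id is a placeholder, and the service "Light" / service path "/LightInfo" are inferred from the HDFS paths shown further down.)

      # Minimal sketch of an NGSIv1 notification as consumed by NGSIRestHandler.
      # Assumptions: Cygnus runs locally; the subscription id is a placeholder.
      import requests

      notification = {
          "subscriptionId": "51c0ac9ed714fb3b37d7d5a8",
          "originator": "localhost",
          "contextResponses": [{
              "contextElement": {
                  "type": "Light",
                  "isPattern": "false",
                  "id": "bedroom3_light",
                  "attributes": [
                      {"name": "lux", "type": "Float", "value": "7"},
                      {"name": "status", "type": "string", "value": "off"},
                  ],
              },
              "statusCode": {"code": "200", "reasonPhrase": "OK"},
          }],
      }

      resp = requests.post(
          "http://localhost:5050/notify",   # http-source port + notification_target
          headers={"Fiware-Service": "Light", "Fiware-ServicePath": "/LightInfo"},
          json=notification,
      )
      print(resp.status_code)               # expect 200 when the event is queued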

      and I found the HDFS file was created:

      {"recvx0054imex0054s":"1556776186","recvx0054ime":"2019-05-02T05:49:46.771Z","fiwarex0053ervicex0050ath":"/LightInfo","entityx0049d":"bedroom3_light","entityx0054ype":"Light","attrx004eame":"lux","attrx0054ype":"Float","attrx0056alue":"7","attrx004dd":[]} {"recvx0054imex0054s":"1556776186","recvx0054ime":"2019-05-02T05:49:46.771Z","fiwarex0053ervicex0050ath":"/LightInfo","entityx0049d":"bedroom3_light","entityx0054ype":"Light","attrx004eame":"status","attrx0054ype":"string","attrx0056alue":"off","attrx004dd":[]} {"recvx0054imex0054s":"1556776989","recvx0054ime":"2019-05-02T06:03:09.962Z","fiwarex0053ervicex0050ath":"/LightInfo","entityx0049d":"bedroom3_light","entityx0054ype":"Light","attrx004eame":"lux","attrx0054ype":"Float","attrx0056alue":"7","attrx004dd":[]} {"recvx0054imex0054s":"1556776989","recvx0054ime":"2019-05-02T06:03:09.962Z","fiwarex0053ervicex0050ath":"/LightInfo","entityx0049d":"bedroom3_light","entityx0054ype":"Light","attrx004eame":"status","attrx0054ype":"string","attrx0056alue":"off","attrx004dd":[]}

      But in Hive, all rows show NULL values:

      hive> select * from ngsi_x004cightx002fx004cightx0049nfox002fbedroom_light_x004cight_row;
      chmod: changing permissions of 'hdfs://hadoop01:9000/tmp/hive/hive/ae8cd691-6e96-428d-8ece-02f6e6fbf527/hive_2019-05-02_16-37-44_821_8122085527913292696-1/-mr-10001': Permission denied: user=hive, access=EXECUTE, inode="/tmp":hadoop:supergroup:drwx-w----
      OK
      ngsi_x004cightx002fx004cightx0049nfox002fbedroom_light_x004cight_row.recvx0054imex0054s ngsi_x004cightx002fx004cightx0049nfox002fbedroom_light_x004cight_row.recvx0054ime ngsi_x004cightx002fx004cightx0049nfox002fbedroom_light_x004cight_row.fiwarex0053ervicex0050ath ngsi_x004cightx002fx004cightx0049nfox002fbedroom_light_x004cight_row.entityx0049d ngsi_x004cightx002fx004cightx0049nfox002fbedroom_light_x004cight_row.entityx0054ype ngsi_x004cightx002fx004cightx0049nfox002fbedroom_light_x004cight_row.attrx004eame ngsi_x004cightx002fx004cightx0049nfox002fbedroom_light_x004cight_row.attrx0054ype ngsi_x004cightx002fx004cightx0049nfox002fbedroom_light_x004cight_row.attrx0056alue ngsi_x004cightx002fx004cightx0049nfox002fbedroom_light_x004cight_row.attrx004dd
      NULL NULL NULL NULL NULL NULL NULL NULL NULL
      NULL NULL NULL NULL NULL NULL NULL NULL NULL
      NULL NULL NULL NULL NULL NULL NULL NULL NULL
      NULL NULL NULL NULL NULL NULL NULL NULL NULL
      Time taken: 0.26 seconds, Fetched: 4 row(s)

      The external table creation query shown in the Cygnus debug console is:

      create external table
      if not exists default.ngsi_x004cightx002fx004cightx0049nfox002fbedroom_light_x004cight_row

      (recvx0054imex0054s bigint,
      recvx0054ime string,
      fiwarex0053ervicex0050ath string,
      entityx0049d string,
      entityx0054ype string,
      attrx004eame string,
      attrx0054ype string,
      attrx0056alue string,
      attrx004dd array<struct<name:string,type:string,value:string>>)

      row format serde 'org.openx.data.jsonserde.JsonSerDe' with serdeproperties ("dots.in.keys" = "true")

      location '/user/ngsi/Light/LightInfo/bedroom_light_Light'

      I guess the mismatch between the Hive table column names and the HDFS file attribute names is the problem.
      The Hive table column names use the encoded names, while the HDFS file attribute names do not (e.g. recvTimeTs vs. recvx0054imex0054s).
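      (For reference: with enable_encoding = true, Cygnus encodes every character that is not allowed in its naming conventions, including capital letters and '/', as 'x' followed by the four-hex-digit Unicode code point, which is why recvTimeTs appears as recvx0054imex0054s. A minimal decoding sketch; the helper name is hypothetical:)

      import re

      def decode_cygnus_name(name):
          # Replace each 'x' + four hex digits with the corresponding Unicode
          # character. Heuristic only: a literal 'x' that happens to precede
          # four hex digits would be decoded too.
          return re.sub(r"x([0-9a-fA-F]{4})",
                        lambda m: chr(int(m.group(1), 16)), name)

      print(decode_cygnus_name("recvx0054imex0054s"))         # -> recvTimeTs
      print(decode_cygnus_name("fiwarex0053ervicex0050ath"))  # -> fiwareServicePath
      print(decode_cygnus_name(
          "ngsi_x004cightx002fx004cightx0049nfox002fbedroom_light_x004cight_row"))
      # -> ngsi_Light/LightInfo/bedroom_light_Light_row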

      So I created an external table by hand and tested it:

      create external table if not exists default.test01 (recvx0054imex0054s bigint, recvx0054ime string, fiwarex0053ervicex0050ath string, entityx0049d string, entityx0054ype string, attrx004eame string, attrx0054ype string, attrx0056alue string,attrx004dd array<struct<name:string,type:string,value:string>>) row format serde 'org.openx.data.jsonserde.JsonSerDe' with serdeproperties ("dots.in.keys" = "true") location '/user/ngsi/Light/LightInfo/bedroom_light_Light'

      As a result, it shows the right values:

      hive> select * from test01;
      chmod: changing permissions of 'hdfs://hadoop01:9000/tmp/hive/hive/bec8d67d-7d94-4cdc-a577-14d990ca342c/hive_2019-05-02_16-42-59_290_896718721577759467-1/-mr-10001': Permission denied: user=hive, access=EXECUTE, inode="/tmp":hadoop:supergroup:drwx-w----
      OK
      1556776186 2019-05-02T05:49:46.771Z /LightInfo bedroom3_light Light lux Float 7 []
      1556776186 2019-05-02T05:49:46.771Z /LightInfo bedroom3_light Light status string off []
      1556776989 2019-05-02T06:03:09.962Z /LightInfo bedroom3_light Light lux Float 7 []
      1556776989 2019-05-02T06:03:09.962Z /LightInfo bedroom3_light Light status string off []
      Time taken: 2.329 seconds, Fetched: 4 row(s)
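      (Since a hand-made table with the same encoded column names reads the data fine, one workaround is to provision the per-entity tables yourself instead of relying on the Cygnus-created one. A sketch using PyHive, reusing the exact DDL that worked above; the host and username are placeholders:)

      # Sketch: provision the external table by hand over HiveServer2 (port
      # 10000, matching hive.server_version = 2 and hive.port in the config).
      from pyhive import hive

      ddl = """
      create external table if not exists default.test01 (
        recvx0054imex0054s bigint,
        recvx0054ime string,
        fiwarex0053ervicex0050ath string,
        entityx0049d string,
        entityx0054ype string,
        attrx004eame string,
        attrx0054ype string,
        attrx0056alue string,
        attrx004dd array<struct<name:string,type:string,value:string>>
      )
      row format serde 'org.openx.data.jsonserde.JsonSerDe'
      with serdeproperties ("dots.in.keys" = "true")
      location '/user/ngsi/Light/LightInfo/bedroom_light_Light'
      """

      conn = hive.Connection(host="hive_host", port=10000, username="username")
      cursor = conn.cursor()
      cursor.execute(ddl)    # idempotent thanks to "if not exists"
      cursor.close()
      conn.close()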

      How can I match the Hive table column names to the HDFS file attribute names?

      Or is there another way to use Cygnus with Hive?

        Activity

        backlogmanager Backlog Manager created issue -
        backlogmanager Backlog Manager added a comment -

        2019-05-02 12:06|CREATED monitor | # answers= 0, accepted answer= False

        backlogmanager Backlog Manager made changes -
        Component/s FIWARE-TECH-HELP [ 10278 ]
        fla Fernando Lopez made changes -
        HD-Enabler Cygnus [ 11304 ]
        fla Fernando Lopez made changes -
        Assignee Fermín Galán [ fermin ]

          People

          • Assignee: fermin Fermín Galán
          • Reporter: backlogmanager Backlog Manager