Help-Desk / HELP-325

FIWARE.Request.Tech.Data.BigData-Analysis.Parquet

    Details

      Description

      Hello, can you please direct this question to the right person?
      Thank you.
      ---------- Forwarded message ----------
      From: Dave Watson <watson.dave@gmail.com>
      Date: Mon, Sep 29, 2014 at 7:07 PM
      Subject: Question re FIWARE COSMOS
      To: francisco.romerobueno@telefonica.com, frb@tid.es

      Hello Francisco:

      We attended the FI-WARE meeting today in London and it was suggested that we
      direct our questions to you.

      1) Do you support Parquet files in HDFS? (Your documentation only shows an
      example with a CSV file; see the short sketch after these questions for the
      kind of workflow we have in mind.)

      2) Will the version of Hadoop in Cosmos be upgraded so we can work with
      Spark and Drill?

      3) What is the memory configuration for nodes in the FIWARE Lab and Testbed
      instances?
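
      To make question 1 concrete, here is the kind of workflow we would like to
      run, sketched with PySpark using the modern SparkSession API. The
      application name, HDFS path and column names are made up for illustration,
      and the sketch assumes Spark itself is available on the cluster, which is
      part of question 2:

          # Minimal sketch: write a small DataFrame to HDFS as Parquet, read it back.
          # The path hdfs:///user/dave/example.parquet is hypothetical.
          from pyspark.sql import SparkSession

          spark = SparkSession.builder.appName("parquet-on-hdfs").getOrCreate()

          # Small in-memory DataFrame standing in for real input data.
          df = spark.createDataFrame([(1, "alice"), (2, "bob")], ["id", "name"])

          # Write as Parquet into HDFS, then read it back to verify.
          df.write.mode("overwrite").parquet("hdfs:///user/dave/example.parquet")
          spark.read.parquet("hdfs:///user/dave/example.parquet").show()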

      See below regarding Drill 0.5 version support:

      Drill now uses the Hadoop 2.4.1 APIs. This includes upgrading Parquet to
      use direct memory and the ability to write larger Parquet files.

      https://blogs.apache.org/drill/entry/apache_drill_beta_release_see

      https://cwiki.apache.org/confluence/display/DRILL/Apache+Drill+0.5.0+Release+Notes
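
      For reference, this is roughly how we would expect to query such a Parquet
      file through Drill once a suitable Hadoop version is in place. The sketch
      below goes through Drill's REST API from Python; the host name, port and
      file path are only placeholders, and it assumes a Drill release recent
      enough to expose that API:

          # Hypothetical sketch: submit a SQL query to a Drillbit over its REST API.
          # Host, port and HDFS path are placeholders, not real FIWARE endpoints.
          import requests

          resp = requests.post(
              "http://drillbit.example.org:8047/query.json",
              json={"queryType": "SQL",
                    "query": "SELECT * FROM dfs.`/user/dave/example.parquet` LIMIT 10"},
          )
          print(resp.json())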

      Thanks for your help here.

      Dave

      Dave Watson
      +447741675325

      _______________________________________________
      Fiware-tech-help mailing list
      Fiware-tech-help@lists.fi-ware.org
      https://lists.fi-ware.org/listinfo/fiware-tech-help

        Activity

        frb Francisco Romero added a comment -

        Hi Dave,

        Sorry for the delay in answering you; I’ve been participating in several FIWARE dissemination events.

        Regarding Parquet and Drill: if you have entered the global instance of Cosmos, you’ll probably have seen that the Hadoop version we are currently using is pretty old. In addition, it is quite limited in resources. It was initially aimed at hackathon participants and newcomers looking for an easy approach to the Hadoop ecosystem, not at more advanced usage. We are working on deploying a, let’s say, more “professional” Big Data platform, and I hope it is ready before the end of the year. In the meantime, I’ll try to stress the current deployment and study whether any of those technologies can be exposed to the FILAB users, but as you point out, Drill requires recent Hadoop APIs.

        Regards,
        Francisco


          People

          • Assignee:
            frb Francisco Romero
          • Reporter:
            fw.ext.user FW External User
          • Votes:
            0
          • Watchers:
            1

            Dates

            • Created:
            • Updated:
            • Resolved: