  1. Help-Desk
  2. HELP-5439

FIWARE.Request.Tech.Data.BigData-Analysis.Issue on Cosmos Global Instance: cannot transfer file parts

    Details

    • Type: extRequest
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Fix Version/s: 2021
    • Component/s: FIWARE-TECH-HELP
    • Labels:
      None

      Description

      Dear FIWARE Staff,

      I would like to report a potential issue on the Cosmos Global Instance.

      We are using Cosmos to store input files for Hadoop jobs and their related
      output. We are transferring them from/to a remote back-end through the
      WebHDFS API.

      During outbound transfers of large files from Cosmos, the need to split the
      file content into parts has arisen. If we use a simple OPEN operation like

      http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=OPEN

      the transfer is interrupted after 2 minutes even if the response has not
      been transferred completely to the client. This is probably due to the
      timeout configuration of the HTTP server deployed on Cosmos.

      As suggested by the WebHDFS API Doc
      <https://hadoop.apache.org/docs/r1.0.4/webhdfs.html#OPEN>, one should
      consider transferring single parts of the requested file by setting the
      offset and length parameters to complete the request successfully.
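
A minimal sketch of that part-by-part approach, assuming the file size is already known (for example from a GETFILESTATUS call). The helper names `part_ranges` and `open_url` are hypothetical, and the host, port and path values are placeholders:

```python
from urllib.parse import quote, urlencode


def part_ranges(file_size, part_size):
    """Yield (offset, length) pairs that together cover file_size bytes."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    offset = 0
    while offset < file_size:
        # The last part may be shorter than part_size.
        length = min(part_size, file_size - offset)
        yield offset, length
        offset += length


def open_url(host, port, path, offset, length):
    """Build a WebHDFS OPEN URL for one part; offset is zero-based."""
    query = urlencode({"op": "OPEN", "offset": offset, "length": length})
    return f"http://{host}:{port}/webhdfs/v1/{quote(path.lstrip('/'))}?{query}"


if __name__ == "__main__":
    # Placeholder values, for illustration only.
    for off, length in part_ranges(200, 90):
        print(open_url("HOST", 50070, "/user/someone/file.bin", off, length))
```

Each URL can then be fetched as a separate request, so that no single response has to outlive the server-side timeout.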

      The issue we found is that the length parameter seems to be ignored by
      the HTTP Server.
      For example, if we execute the following

      http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=OPEN&offset=100&length=10

      on a 200-byte file, we get the last 100 bytes of the file itself,
      and not bytes 101 to 110.
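
The behaviour described above can be detected on the client side by comparing the size of the returned body with the size that was requested. The sketch below is only an illustration: the function names are hypothetical, and the "server" is simulated with byte slices instead of a real WebHDFS call.

```python
def expected_part_size(file_size, offset, length):
    """Number of bytes a correct OPEN with offset/length should return."""
    return max(0, min(length, file_size - offset))


def trim_part(body, file_size, offset, length):
    """Return (part, honored): the requested bytes, and whether the
    server respected the length parameter."""
    want = expected_part_size(file_size, offset, length)
    return body[:want], len(body) == want


if __name__ == "__main__":
    data = bytes(range(200))     # stand-in for a 200-byte file
    buggy_body = data[100:]      # length ignored: everything from the offset
    part, honored = trim_part(buggy_body, len(data), 100, 10)
    print(len(part), honored)
```

Trimming only repairs the content of the part. A server that ignores length still tries to send the rest of the file, so a real client would also need to stop reading and close the connection once it has the bytes it asked for.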

      We checked the current version of Hadoop deployed on Cosmos and found
      that release 0.20.2-cdh3u6 is currently running. Looking at
      the HDFS changelogs
      <https://hadoop.apache.org/docs/r0.23.11/hadoop-project-dist/hadoop-hdfs/CHANGES.txt>,
      there is an improvement (HDFS-3794) committed in release 0.23.3 that could
      potentially be related to the reported behaviour.

      Best Regards,
      Andrea Sassi

      Since January 1st, old domains won't be supported and messages sent to any domain different to @lists.fiware.org will be lost.
      Please, send your messages using the new domain (Fiware-tech-help@lists.fiware.org) instead of the old one.
      _______________________________________________
      Fiware-tech-help mailing list
      Fiware-tech-help@lists.fiware.org
      https://lists.fiware.org/listinfo/fiware-tech-help
      [Created via e-mail received from: Andrea Sassi <brainswitch@gmail.com>]

        Activity

        fw.ext.user FW External User added a comment -

        The deployment started several months ago; this is a very ambitious task.
        We are in the final steps, and it will most probably be ready after the
        Xmas holidays.

        Regards,
        Francisco

        On 16/12/15 8:38, "Manuel Escriche (JIRA)" <jira-help-desk@fi-ware.org>

        ________________________________

        Este mensaje y sus adjuntos se dirigen exclusivamente a su destinatario, puede contener información privilegiada o confidencial y es para uso exclusivo de la persona o entidad de destino. Si no es usted. el destinatario indicado, queda notificado de que la lectura, utilización, divulgación y/o copia sin autorización puede estar prohibida en virtud de la legislación vigente. Si ha recibido este mensaje por error, le rogamos que nos lo comunique inmediatamente por esta misma vía y proceda a su destrucción.

        The information contained in this transmission is privileged and confidential information intended only for the use of the individual or entity named above. If the reader of this message is not the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication is strictly prohibited. If you have received this transmission in error, do not read it. Please immediately reply to the sender that you have received this communication in error and then delete it.

        Esta mensagem e seus anexos se dirigem exclusivamente ao seu destinatário, pode conter informação privilegiada ou confidencial e é para uso exclusivo da pessoa ou entidade de destino. Se não é vossa senhoria o destinatário indicado, fica notificado de que a leitura, utilização, divulgação e/ou cópia sem autorização pode estar proibida em virtude da legislação vigente. Se recebeu esta mensagem por erro, rogamos-lhe que nos o comunique imediatamente por esta mesma via e proceda a sua destruição

        frb Francisco Romero added a comment -

        Adding impediment: the user is waiting for the new cluster deployment.

        fw.ext.user FW External User added a comment -

        Hi all,

        Regarding this issue, it must be said that the new storage cluster is
        finally working. This new cluster is based on a modern Hadoop and the
        problem regarding offsets in WebHDFS should be fixed.

        As said, this cluster is only suited for storage, not computing (a second
        cluster specifically addressing computing will be released in a few
        weeks). So, if your use case is about storing data in HDFS and consuming
        it through WebHDFS, with no Hive and no MapReduce, then you are ready to move
        from the "old" cluster (cosmos.lab.fiware.org) to the new one
        (storage.cosmos.lab.fiware.org). If you are interested, let me know and
        I'll provision the account and migrate the data from one cluster to
        the other.

        Regards,
        Francisco

        On 16/12/15 8:56, "FRANCISCO ROMERO BUENO"

        mev Manuel Escriche added a comment -

        Any update? Still impeded? Thanks

        fw.ext.user FW External User added a comment -

        Hi all,

        This issue may be considered fixed now that the new storage cluster has
        been announced. I'm copying the announcement:

        -----

        Dear FIWARE Lab user,
        A few months ago our Big Data team started working on deploying a new global
        instance of Cosmos. The current "old" one is reaching its limit, both in
        storage and computing capabilities, so moving to a larger and more powerful
        cluster was needed.
        Regarding this new instance, we have decided to split storage and
        computing into two different clusters. That's why I'm glad to announce the
        storage one is ready to be used. Nevertheless, some remarks must be made:

        • It must be considered a beta. Errors and misconfigurations may still
          arise.
        • It is only suited for storage purposes. The software stack only
          comprises HDFS, not MapReduce or any other computing tool.
        • No ssh access has been enabled. From the current "old" cluster we have
          learned that most people used the ssh access to install their own software
          or to store vast amounts of data in the local file system, which is very
          bad behaviour since the clusters are always shared. Thus, the only allowed
          access is through the WebHDFS REST API, which is protected by OAuth2.
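
A sketch of what a WebHDFS request against such an OAuth2-protected endpoint might look like. The `X-Auth-Token` header name, the helper name `build_open_request`, and the host, port, path and token values are all assumptions for illustration; they are not taken from the announcement. The request is only built here, not sent.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_open_request(host, port, path, token, offset=None, length=None):
    """Build (but do not send) a WebHDFS OPEN request carrying an OAuth2 token."""
    params = {"op": "OPEN"}
    if offset is not None:
        params["offset"] = offset
    if length is not None:
        params["length"] = length
    url = (f"http://{host}:{port}/webhdfs/v1/"
           f"{quote(path.lstrip('/'))}?{urlencode(params)}")
    # Assumption: the token travels in an X-Auth-Token header.
    return Request(url, headers={"X-Auth-Token": token})


if __name__ == "__main__":
    # Placeholder values, for illustration only.
    req = build_open_request("HOST", 14000, "/user/someone/file.bin",
                             "MY_TOKEN", offset=100, length=10)
    print(req.get_method(), req.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would perform the actual call.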

        This cluster, based on a modern Hadoop distribution, and the upcoming
        computing one will replace the current "old" cluster in the short term.
        Nevertheless, for now it is not mandatory to use it. Moving to this
        cluster is only recommended:

        • If you are interested in pure storage (i.e. no analytics on the data;
          for instance, if you are using Cygnus to build historical records of
          Orion's context data)
        • and your data is consumed, at most, through the WebHDFS REST
          API.

        It must be said that those of you moving to this cluster right now will
        enjoy a larger HDFS quota. If you are interested, please send an email to
        francisco.romerobueno@telefonica.com
        If you have any doubts, do not hesitate to contact us.
        Best regards,
        Cosmos Big Data team

        On 17/2/16 10:51, "Manuel Escriche (JIRA)" <jira-help-desk@fi-ware.org>


          People

          • Assignee:
            frb Francisco Romero
            Reporter:
            fw.ext.user FW External User
          • Votes:
            0
            Watchers:
            3
