Help-Desk / HELP-7721

[Fiware-lab-help] SpainTenerife node - Incorrect disk occupation values reported on Infographics

    Details

    • Type: extRequest
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Fix Version/s: 2021
    • Component/s: FIWARE-LAB-HELP
    • Labels:
      None

      Description

      Hello,

      The information shown on Infographics for the Spain Tenerife node seems incorrect. It shows 100% disk occupancy, which is wrong, and the numerical values shown on the gauges are also incorrect. All our compute nodes are mirrors of each other, yet each one shows a different value.

      In fact, we recently increased the capacity of our nodes to 400 GB each, and this change is not reflected on Infographics either.

      Could you please check?

      Thanks!

      [Created via e-mail received from: "Gil Munoz, Roberto" <roberto.gil@atos.net>]

        Activity

        fw_support_st Atos Spain Tenerife Node Support Team added a comment -

        Hello,

        Yes, that is actually the case. The /var/lib/nova/ partition hosts the machines, and we have around 170 GB on that partition. The reported value appears to be the actual one, because there are thin-provisioned machines adding up to 420 GB, which explains what is being reported to the monitoring.

        I think this ticket can be closed, as the value is reported by our backend and is in fact correct. We will investigate internally to find out where the overprovisioning fits into the picture, as it doesn't seem to be configured in our system by default.
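
For illustration, the arithmetic behind the figures in this comment can be sketched as follows. This is a minimal sketch, not the monitoring code: the function names are hypothetical, and the clamping of the gauge to 100% is an assumption about how Infographics displays values above full capacity.

```python
def disk_occupancy_percent(allocated_gb: float, capacity_gb: float) -> float:
    """Return allocated disk space as a percentage of physical capacity."""
    if capacity_gb <= 0:
        raise ValueError("capacity_gb must be positive")
    return 100.0 * allocated_gb / capacity_gb


def gauge_value(percent: float) -> float:
    """Clamp a percentage to the 0-100 range (assumed gauge behaviour)."""
    return max(0.0, min(100.0, percent))


# Figures quoted in the comment: 420 GB of thin-provisioned disks
# on a partition of about 170 GB.
raw = disk_occupancy_percent(allocated_gb=420, capacity_gb=170)
print(f"raw occupancy: {raw:.0f}%")               # prints "raw occupancy: 247%"
print(f"gauge shows: {gauge_value(raw):.0f}%")    # prints "gauge shows: 100%"
```

Under these assumptions, any allocation above the physical size of the partition would show up as a full gauge, which is consistent with the 100% reported in the description.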

        jicg José Ignacio Carretero Guarde added a comment -

        Any responses? Can we close the ticket?

        jicg José Ignacio Carretero Guarde added a comment -

        It would be helpful if you could provide the output of the "df -h" command for any of your compute nodes.

        danieles Daniele Santoro added a comment -

        On /var/lib/nova/instances, if I'm not wrong. Atos Spain Tenerife Node Support Team, can you confirm?

        jicg José Ignacio Carretero Guarde added a comment -

        Where are those 400 GB disks mounted on the compute nodes?

        danieles Daniele Santoro added a comment -

        OK, I agree. I just need confirmation that the disk space figures from the "nova host-describe" command we analysed depend on that value. So maybe you could:

        1) Run the command "nova host-describe NODEx"
        2) Change the replication factor value and restart nova
        3) Run the command "nova host-describe NODEx" again

        Then check whether those numbers change and let me know.

        After that we can close this ticket, and I will plan the implementation of a new feature in the monitoring system to support this disk replication factor. Once it is ready, we can test and tune the value for your region.

        Sounds good?

        fw_support_st Atos Spain Tenerife Node Support Team added a comment -

        Hi, Daniele

        Yes, I have been checking this point. I haven't found the disk_allocation_ratio parameter in our nova.conf configuration files, which makes me think that either the default value is not 1 or the problem lies somewhere else.

        To push this forward, this is my proposal: I will force disk_allocation_ratio to 1.5 on the controllers, and then you can patch the monitoring to take into account that the value can differ from 1. That way we can check afterwards whether the values match.

        danieles Daniele Santoro added a comment -

        I inspected the monitoring API code and it confirms that we do not consider “disk_allocation_ratio”, or rather, we assume 1 as the default. Therefore, if a different value is set for your region, the disk space reported by the monitoring API is wrong.

        Atos Spain Tenerife Node Support Team: Have you had a chance to check the value of that configuration parameter in your region? As specified in table 3.52 here: https://goo.gl/5WRrkf it should be present in the nova.conf file on your controllers.

        I need this information to confirm the cause of the issue; then I can start developing a new feature in the monitoring system to support this per-region customisation.
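
As a rough sketch of what supporting this parameter could look like, the snippet below reads disk_allocation_ratio from the text of a nova.conf file and applies it when computing occupancy. It is not the monitoring API code: the function names are hypothetical, the option is assumed to live in the [DEFAULT] section, and the fallback of 1.0 mirrors the default described in this comment.

```python
import configparser


def read_disk_allocation_ratio(nova_conf_text: str, default: float = 1.0) -> float:
    """Read disk_allocation_ratio from nova.conf text, falling back to a default.

    Assumes the option, when present, sits in the [DEFAULT] section.
    """
    # interpolation=None avoids errors on literal '%' characters in the file;
    # strict=False tolerates duplicated sections or options.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(nova_conf_text)
    return parser.getfloat("DEFAULT", "disk_allocation_ratio", fallback=default)


def occupancy_percent(used_gb: float, physical_gb: float, ratio: float = 1.0) -> float:
    """Return used space as a percentage of capacity scaled by the allocation ratio."""
    effective_capacity_gb = physical_gb * ratio
    if effective_capacity_gb <= 0:
        raise ValueError("effective capacity must be positive")
    return 100.0 * used_gb / effective_capacity_gb


# Hypothetical nova.conf fragment for a region that overrides the ratio.
sample_conf = "[DEFAULT]\ndisk_allocation_ratio = 1.5\n"
ratio = read_disk_allocation_ratio(sample_conf)
print(occupancy_percent(used_gb=100, physical_gb=400, ratio=ratio))
```

With a ratio of 1, the function reduces to the plain used/physical percentage, so regions that do not override the parameter would see no change.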

        danieles Daniele Santoro added a comment -

        This issue is related to OPS-1445. Its status can also be tracked there.

        danieles Daniele Santoro added a comment -

        Dear all, can we try to move forward on this ticket? We plan to solve it in the next Sprint, 6.2.2. Please provide the required information.

        danieles Daniele Santoro added a comment -

        Dear operators, is there any news about my last request, sent by e-mail?

        danieles Daniele Santoro added a comment -

        The monitoring chain has been fixed and the task is back in progress. I'm waiting for detailed information from the node operators.

        danieles Daniele Santoro added a comment -

        We currently have a problem in the monitoring chain that prevents Infographics from showing the latest monitored data from the regions. For this reason I cannot verify your problem right now, but I will take a look once the issue with the monitoring system is resolved.

        danieles Daniele Santoro added a comment -

        Dear,

        I will take a look at this issue and I will update you as soon as possible.

        Daniele

        ichulani ilknur chulani added a comment -

        Hi Daniele,

        It seems Pablo forgot to assign the ticket to you. Could you kindly take a look? We would really like to sort this issue out soon.

        Thanks in advance for your help.

        Regards,

        ilknur

        pra Pablo Rodríguez Archilla added a comment - - edited

        Well,

        This issue spans 3 different components (and maintainers): Kasia Di Meo would be the right person for visualization issues, Daniele Santoro for wrong aggregated information in the Monitoring API, and me for the rest of the Monitoring components.

        I think that, in this case, we should assign the ticket to Daniele, and he will contact the others in order to solve this known problem, which involves different components.

        Regards.

        ichulani ilknur chulani added a comment -

        Dear Pablo,

        I am assigning this issue to you, as I believe you are the maintainer of the Infographics monitoring tools. Could you kindly check whether there is a defect in the tool, a missing or incorrect configuration on our part, or something else?

        Thanks,

        ilknur

        WIGNER Budapest Node Helpdesk added a comment -

        The issue has been emailed:

        • Time sent: 17/Nov/16 1:07 PM
        • To: roberto.gil@atos.net
        • Subject: (HELP-7721) [Fiware-lab-help] SpainTenerife node - Incorrect disk occupation values reported on Infographics

        Dear Roberto,

        I forwarded your problem to the Spain Tenerife Support Team.

        Best Regards,
        Gergő


          People

          • Assignee: spain.node Spain Node Support Team
          • Reporter: fw.ext.user FW External User
          • Votes: 0
          • Watchers: 7

            Dates

            • Created:
              Updated:
              Resolved: