  1. Help-Desk
  2. HELP-988

FIWARE.Request.Tech.Cloud.ObjectStorage.StorageContainerStoredFileSize

    Details

    • Type: extRequest
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Fix Version/s: 2021
    • Component/s: FIWARE-TECH-HELP
    • Labels: None
    • HD-Chapter: Cloud
    • HD-Enabler: Object Storage

      Description

      Hello,

      Across several uploaded files of varying sizes and types, we have found that the stored file is about 1.27 times larger than the uploaded one (a file with an original size of 375MB results in a stored file of 477MB).
      For small files this is not a big issue, but large data files would grow by the same ratio: a file with an original size of 5GB would result in 5.8GB, which is almost 1GB of increased size.
      This results in poor utilisation of the cloud storage space, and there is room for optimisation.

      Browser: Safari 8.0
      OS: Mac OS X 10.10

        Activity

        meth Kalman Meth added a comment -

        Can the reporter please add additional context to this issue?
        What is your computing environment? Specifically, what Object Store is being used?
        How can you tell the file stored is 477MB instead of 375MB?
        How can someone reproduce this test?
        Thanks.

        pmbento Patrício Batista (Inactive) added a comment -

        When we were testing this we used the Spain server. We created a container and uploaded some objects to the container. Try to do this yourself and verify that the file size increases as stated above.

        jmk John M. Kennedy (Inactive) added a comment -

        Hello. I have been looking into this. You may be referring to the size that is displayed in the FIWARE Lab portal. The portal encodes the data you upload using Base64 to facilitate transferring binary information via a text-based protocol. This encoding scheme does have an overhead, as described here: http://en.wikipedia.org/wiki/Base64. As you note, the file you receive when you download from the portal is returned to its original size.
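        The Base64 overhead mentioned above can be estimated directly: every 3 input bytes become 4 output characters, so the encoded size is roughly 4/3 of the original. The sketch below is illustrative only. It assumes the reporter's 375MB is in decimal megabytes and that the portal may display sizes in binary units (MiB); neither assumption is confirmed in this ticket.

```python
import base64


def base64_encoded_size(n_bytes: int) -> int:
    """Size in bytes of the padded Base64 encoding of n_bytes of data."""
    # Every group of 3 input bytes (the last one padded) yields 4 characters.
    return 4 * ((n_bytes + 2) // 3)


# Cross-check the formula against the standard library on a small payload.
payload = bytes(range(256)) * 4 + b"x"  # 1025 bytes
assert len(base64.b64encode(payload)) == base64_encoded_size(len(payload))

# Assumption: the reporter's 375MB file is 375 * 10**6 bytes.
original = 375 * 10**6
encoded = base64_encoded_size(original)

print(f"ratio:   {encoded / original:.3f}")
print(f"encoded: {encoded / 10**6:.0f} MB = {encoded / 2**20:.0f} MiB")
```

        Under these assumptions the encoded file is 500 MB, or about 477 MiB, which would be consistent with the figure in the description if the two sizes were reported in different units.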

        A side note: if your concern is more about back-end efficiency, you may be interested to know that the Object Storage GE uses OpenStack Swift as the back end. To deliver redundancy, OpenStack Swift makes copies (typically 3 copies in total) of whatever is being stored. The level of redundancy can be configured on each OpenStack Swift deployment. Storage policies have recently been introduced to support different redundancy and other policies for different objects in OpenStack Swift; see http://www.openstack.org/blog/2014/07/openstack-swift-2-0-released-and-storage-policies-have-arrived/. This will facilitate erasure codes: developers are currently working on optimised schemes whereby the same levels of redundancy can be accommodated in half the space. See https://swiftstack.com/blog/2013/07/10/erasure-codes-with-openstack-swift/. As the FIWARE infrastructure is updated, it will be able to leverage these improvements and reduce the back-end costs of storing objects.
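        To make the redundancy trade-off concrete, the following sketch compares the raw back-end footprint of full replication with that of an erasure-coded layout. The 10 data + 4 parity split is a hypothetical parameter choice for illustration, not a FIWARE or Swift default, and the calculation ignores per-fragment metadata.

```python
def replicated_footprint(object_bytes: int, replicas: int = 3) -> int:
    """Raw bytes used when every object is stored as `replicas` full copies."""
    return object_bytes * replicas


def erasure_coded_footprint(object_bytes: int, data_fragments: int,
                            parity_fragments: int) -> int:
    """Raw bytes used when an object is split into data fragments plus parity."""
    # Each fragment holds ceil(object_bytes / data_fragments) bytes.
    fragment_bytes = -(-object_bytes // data_fragments)
    return fragment_bytes * (data_fragments + parity_fragments)


size = 375 * 10**6  # a 375 MB object, assuming decimal megabytes
replicated = replicated_footprint(size)          # 3 full copies
coded = erasure_coded_footprint(size, 10, 4)     # hypothetical 10+4 scheme

print(f"3 replicas:   {replicated / 10**6:.0f} MB ({replicated / size:.1f}x)")
print(f"10+4 erasure: {coded / 10**6:.0f} MB ({coded / size:.1f}x)")
```

        With these illustrative parameters the erasure-coded layout uses 1.4 times the object size instead of 3 times, which is in line with the "half the space" estimate above.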

        meth Kalman Meth added a comment -

        Forwarded to John Kennedy to address.

        jmk John M. Kennedy (Inactive) added a comment -

        Hello. To confirm: the Web Portal displays a larger file size than the original because it encodes the original file using Base64 before passing it to the Object Storage GE. Base64 encoding increases file size (see the previous comment in this thread).

        Binary files can also be sent directly to the Object Storage GE, in which case they will be stored with the original size. However, as mentioned previously, several copies of the content will be stored on the back end for redundancy.
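        As a minimal sketch of sending a binary file directly, the snippet below builds an OpenStack Swift object-upload request (an HTTP PUT to the container/object path with an X-Auth-Token header) using only the Python standard library. The storage URL, token, container and object names are hypothetical placeholders; the real values come from your own authentication step and are not given in this ticket.

```python
import urllib.request
from urllib.parse import quote


def build_upload_request(storage_url: str, token: str, container: str,
                         object_name: str, data: bytes) -> urllib.request.Request:
    """Build a Swift PUT request that uploads `data` as raw bytes (no Base64)."""
    url = f"{storage_url.rstrip('/')}/{quote(container)}/{quote(object_name)}"
    return urllib.request.Request(
        url,
        data=data,
        method="PUT",
        headers={
            "X-Auth-Token": token,
            "Content-Type": "application/octet-stream",
        },
    )


# Hypothetical values, for illustration only.
request = build_upload_request(
    "https://swift.example.org/v1/AUTH_demo",
    "TOKEN",
    "my-container",
    "data/sample.bin",
    b"\x00\x01\x02\x03",
)
print(request.get_method(), request.full_url, len(request.data), "bytes")

# To actually send it against a real deployment:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

        Because the body is sent as raw bytes, the stored object should have the same size as the local file.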

        I believe this ticket can be closed. Another possibility is to investigate if the portal can be re-engineered to avoid the need to Base64 encode. This may not be trivial depending on the support provided by the underlying technologies the portal uses.

        Thank you - John


          People

          • Assignee: meth Kalman Meth
          • Reporter: pmbento Patrício Batista (Inactive)