Ask Slashdot: Cloud Service On a Budget?

First time accepted submitter MadC0der writes "We just signed a project with a very large company. We are a computer vision company, and our project gathers images from a facility in PA. Our company is located in TN. The company we're gathering images from is on a very high-speed fiber optic network. However, being a small company of 11 developers and 1 systems engineer, we're on a business-class 100 Mb cable connection, which works well for us but not in this situation. Each capture from the client in PA is a 1.5 MB .bmp image along with a 3 MB depth map file, making each snapshot a little under 5 MB. This may sound small, but images are taken every 3-5 seconds, which adds up to a very large amount of data captured and transferred each day. Our facility is incapable of handling such large transfers without affecting internal network performance. We've come to the conclusion that a cloud service would be the best solution for our problem. We're now thinking the customer's workstation will sync the data with the cloud, and we can automate pulling the data during off hours so we won't encounter congestion during analysis. Can anyone suggest a stable, fairly priced cloud solution that will sync large amounts of offsite data for retrieval at our convenience (a nightly rsync script should handle this process)?"
  • Bring your own server. Depending on the time frame/duration of the project, it might be more cost-effective to rent a quarter or half rack in a datacenter and build/buy your own servers. High initial up-front cost, but it saves money in the long run.
    • Re: (Score:3, Funny)

      by BobC ( 101861 )

      Is the data being generated 24/7? If so, that's 432 GB/day, pretty much exactly 12 hours' worth of your 100 Mbps bandwidth. So some spooling is needed, but why in the cloud? The main goal would seem to be avoiding paying twice to move the data, so you'd want to avoid routing it through a 3rd party if at all possible.

      1. The simplest solution would appear to be to put a laptop with a 500+ GB HD at their facility. A laptop because it essentially has a built-in UPS, and the CPU can sleep much of the time.

      2. Devel

      • Or instead of the cloud, pay for a faster internet connection.
        Or transport the data on HDs via FedEx if latency isn't a problem [I would use SSDs in a padded metal case, knowing FedEx].

        • Re:BYOS (Score:5, Funny)

          by Chrisq ( 894406 ) on Thursday September 12, 2013 @04:13AM (#44827631)

          [I would use SSDs in a padded metal case, knowing FedEx].

          FedEx is like UDP: an unreliable delivery service. In fact, the only fault of UDP it doesn't reproduce is duplication. Things can arrive broken, out of order, delayed, or not at all, but I have never heard of FedEx delivering multiple copies!

      • by orkim ( 238312 )

        Your math is off just a little bit there.

        60 sec * 60 min * 24 hours = 86,400 seconds in a day

        86,400 seconds / 5 seconds = 17,280 pictures

        17,280 pictures * 5 MB = 86,400 MB

        Or roughly 84.4 GB, which should lower the bar even more for that 100 Mbps connection.
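
        If anyone wants to rerun the numbers, here's the same back-of-the-envelope calculation as a quick Python sketch (the 5 MB snapshot and 3-5 second interval come from the summary; everything else is just unit conversion):

          # Daily data volume and transfer time, per the summary's figures.
          SNAPSHOT_MB = 5      # ~1.5 MB .bmp + ~3 MB depth map per capture
          INTERVAL_S = 5       # captures every 3-5 s; 5 s is the low end
          LINK_MBPS = 100      # the business-class cable line

          snapshots_per_day = 24 * 60 * 60 // INTERVAL_S    # 17,280
          daily_mb = snapshots_per_day * SNAPSHOT_MB        # 86,400 MB
          daily_gb = daily_mb / 1024                        # ~84.4 GB

          # Hours to push one day of data over the link (8 bits per byte).
          hours = daily_mb * 8 / LINK_MBPS / 3600           # ~1.9 h
          print(f"{daily_gb:.1f} GB/day, ~{hours:.1f} h at {LINK_MBPS} Mbps")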

        • Yeah, furthermore: the summary lists these as being a bmp and a depth map (which I assume is also raster data, like a bmp). Although I am sure they don't want to compress these to something lossy like JPEG, something lossless like tar/gzip before they are sent would probably cut that down by a good margin (20% is likely, and depending on the data maybe 50% or more). At that point you are talking something on the order of 50 GB per day, which shouldn't cause any problems at all if you throttle the transfer.

          • rsync can easily compress the stream when sending and decompress when receiving.

            This company could market their service as an appliance + service: have whatever computing and data storage power on-site, so only analysis results would be sent over the network instead of the raw data.
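
            For what it's worth, the nightly pull the submitter describes could be as simple as this Python wrapper around rsync (the host and paths are invented placeholders; -z does the stream compression mentioned above):

              import subprocess

              # Nightly pull of the day's captures from the customer's spool box.
              # The host and paths are placeholders, not from the summary.
              SRC = "capture@customer-spool.example.com:/var/spool/captures/"
              DST = "/data/incoming/captures/"

              subprocess.run([
                  "rsync",
                  "-a",                     # recurse, preserve times/permissions
                  "-z",                     # compress the stream in transit
                  "--partial",              # resume interrupted transfers
                  "--remove-source-files",  # drain the remote spool afterwards
                  "--bwlimit=8000",         # ~8 MB/s cap if the job overruns into work hours
                  SRC, DST,
              ], check=True)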

    • Also, the data center bandwidth bills will be very high. I would get a second Internet connection at the home office and keep the server in house. Cheaper all round.
  • by atari2600a ( 1892574 ) on Wednesday September 11, 2013 @11:34PM (#44826467)
    ...WHY are you using BMP in the first place? Does whatever you're generating these on not have the processing capability to compress to PNG before transferring? I mean it SOUNDS like it'd save 10-20% off the total transfer...Anyways, what I'd do is I'd simply plop a server rack at the source that takes all the images for a given hour or whatever, tar.gz.bz2.whatevers them & send them over. Otherwise, I mean, Amazon wouldn't be TERRIBLE?
    • by Anonymous Coward

      ...WHY are you using BMP in the first place? Does whatever you're generating these on not have the processing capability to compress to PNG before transferring? I mean it SOUNDS like it'd save 10-20% off the total transfer...Anyways, what I'd do is I'd simply plop a server rack at the source that takes all the images for a given hour or whatever, tar.gz.bz2.whatevers them & send them over. Otherwise, I mean, Amazon wouldn't be TERRIBLE?

      My (wild) guesses:
      * can't ask the HD security cameras their customers are using (e.g. for face recog) to waste time on compression
      * can't plop a server rack near any (and all) of those security cameras; it would make them too obvious.

    • by FishOuttaWater ( 1163787 ) on Thursday September 12, 2013 @12:05AM (#44826643)
      Yes, your first line of defense is to examine what they need as far as these images, and that will tell you how far you can go in reducing their size for transmission and storage. Can they be scaled down? Can they be lossy? Can you take some time to run a more effective lossless algorithm on them? Is there redundancy between images? Secondly, do you have to move the whole image? Can you do your work on a lower quality image to define the series of steps required and then apply those steps remotely at their location? Just think real hard about what the requirements are, and don't rush yourself. You may come up with your best ideas in the shower on this when you have time to think outside the box.
    • by Osgeld ( 1900440 )

      That's pretty much what we do in house: the machine generating the image data stores it locally, a server pulls the data on a schedule, and that server gets incrementally backed up offsite when no one is around.

      The problem with having systems rely on a network to function is that you can halt production or lose valuable data (because that massive failure that gets to the customer's customer WILL happen while your connection is down).

    • by gerf ( 532474 )
      There are compression plug-ins for servers. He'll need a server on site as a buffer in case of a hiccup to the cloud at the very least. But if he's putting in a server, he can just do it all himself anyway.
  • Sounds like... (Score:4, Interesting)

    by cultiv8 ( 1660093 ) on Wednesday September 11, 2013 @11:34PM (#44826471) Homepage
    the sales guy oversold your capabilities. Instead of asking about cloud options, why don't you just pick a server host with a good reputation (Amazon and Rackspace come to mind) and pass the costs onto the client?
    • Or that the business and product owners under-priced the monthly contract with the client.

      And what the heck does your internal network have to do with the performance of your product? Separate your general business network from your server network, if not for performance or HIPAA, then for the day when one of your developers or unpatched machines does something to DoS your business.

      Also, you might want to read up on MTU [wikipedia.org]. Large file transfers might be better served with an MTU larger than 1500.

      • MTU is enforced at the hardware layer, which he does not have total control over.

        Better to change the window sizing, and use HPN-SSH, which will allow you to set the window sizing. The developer of SSH made the decision that

        Of course you have to own both sides to do that as well, and make changes to the stack... royal pain in the arse.

        Easier to install WAN accelerators like Silver Peak, Cisco WAAS, or Riverbed Steelhead devices to get better performance and compression.

    • by klubar ( 591384 )

      I have to agree that the host/server/bandwidth costs should be a relatively small factor in your calculation. Reliability, security, and responsiveness really should be more important. The difference between top-tier and bottom-tier hosting/cloud is probably no more than a factor of 2 -- you can easily burn through that savings with a couple of hours of downtime or a hosting vendor screw-up.

      If cost is really important, I'd get it working first at a top-tier vendor and then over time try to squeeze out costs--eit

  • by duckgod ( 2664193 ) on Wednesday September 11, 2013 @11:38PM (#44826499)
    Assuming you don't need real-time analysis (it doesn't look like it from the problem description), send a couple of 500 GB hard drives and have someone mail you the daily load of images with overnight shipping.
    • by Anonymous Coward

      This is actually a good solution.

      I am informed that Google does this regularly to save money transmitting large volumes of data.

    • by Resol ( 950137 )
      Yep, in school (long ago) there was an old adage -- "never underestimate the bandwidth of a semi full of mag tapes". Sure the latency is high, but in many cases, not an issue!
    • by cbope ( 130292 )

      Shuttling a couple hard disks back and forth every day of the week using overnight shipping would be a fairly expensive option. You would have to have at minimum 2 sets of disks, sending them both ways every day, the shipping costs alone would be high if you do this on a daily basis. We are talking 2x the daily overnight shipping costs for a 2 pound package, multiplied by an average 21 working days/month. I don't know what the typical costs for overnight shipping in the US these days, but let's say $25 per shipment and $50/day. The monthly shipping costs work out to be $1050. And that does not include all the "manual" labor of copying data to/from the disks, packing, shipping paperwork, etc. The cost of the disks would be fairly trivial in comparison to the shipping costs.

      • by n7ytd ( 230708 )

        Shuttling a couple hard disks back and forth every day of the week using overnight shipping would be a fairly expensive option. You would have to have at minimum 2 sets of disks, sending them both ways every day, the shipping costs alone would be high if you do this on a daily basis. We are talking 2x the daily overnight shipping costs for a 2 pound package, multiplied by an average 21 working days/month. I don't know what the typical costs for overnight shipping in the US these days, but let's say $25 per shipment and $50/day. The monthly shipping costs work out to be $1050. And that does not include all the "manual" labor of copying data to/from the disks, packing, shipping paperwork, etc. The cost of the disks would be fairly trivial in comparison to the shipping costs.

        Also, you would likely want a larger pool of disks to spread the failure risk, as all the bumps and shocks they receive being shipped back and forth every day are very likely to result in damage and a short lifespan.

        I totally agree on this method for one-time or infrequent large transfers, but I think you are creating more problems by trying to use this method for daily transfer of data.

        You are correct about the logistics problems of this approach, but there are ways to make it more cost-effective. As another poster mentioned, SD cards would be a good way to reduce the weight and fragility of the package. LTO tapes would be another option. If the OP can accept another day of latency, the Postal Service offers flat-rate Priority Mail boxes that could ship 2-day instead of overnight for a little over $5. By investing in some more media to add a buffer to the system, the return trip could b

    • If a bigger pipe is too expensive, overnight shipping of a hard drive every day is going to be WAY too expensive.

  • by Anonymous Coward

    Assuming 5MB of data every 5 seconds, you're dealing with ~90GB of data a day. So, looking at Amazon's pricing model (http://aws.amazon.com/s3/pricing/), and assuming you delete the data after you pull it, the storage total should be in the range of $0.095 * 90GB = $8.55/mo. Transfers into S3 are free. You'll be transferring ~2.7TB/mo out (90GB*30); at $0.120/GB, that's $324.00/mo in transfer fees.

    Now, if that data isn't being accumulated 24/7 (i.e. if it's only 8/5, for example), that lowers your monthly fees
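
    Roughly, as a Python sketch (the $/GB rates are the ones quoted above and will have changed since; tiered pricing and per-request fees are ignored):

      # Ballpark monthly S3 bill using the rates quoted above.
      daily_gb = 90            # ~5 MB every 5 s, around the clock
      storage_rate = 0.095     # $/GB-month, standard tier (as quoted)
      egress_rate = 0.120      # $/GB transferred out (as quoted)

      storage = storage_rate * daily_gb        # data deleted after each pull
      egress = egress_rate * daily_gb * 30     # ~2.7 TB/month pulled back out

      print(f"storage ~${storage:.2f}/mo, egress ~${egress:.2f}/mo, inbound free")
      # -> storage ~$8.55/mo, egress ~$324.00/mo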

    • by DeSigna ( 522207 )

      Alternatively, an $80/mo Linode (or similar) plan would cache 2 days of data (~200GB storage), offer some capacity to 'cook' it a bit before re-downloading (say, do some compression) and have enough transfer (8TB/mo) all in one shiny package. For pure storage, I think Dropbox and similar AWS-hosted services weigh in around the $60/mo mark at what would be needed.

      Personally, I would spend money on an additional, dedicated Internet connection or (better) WAN tail to the customer and drop some staging hardware

    • Bittorrent Sync [bittorrent.com] is exactly what you're looking for.

      I just set up this same thing to back up all my photos. I was bouncing between rsync, Samba, and various other programs. I wanted something to sync between numerous different computers and off site.

      BitTorrent Sync solved all of this. It's almost as if they planned for people using it the way I am. In addition to Mac and Windows clients, they also have

      • Linux ARM
      • Linux PowerPC
      • Linux i386
      • Linux x64
      • Linux PPC QorIQ
      • Linux_i386 (glibc 2.3)
      • Linux_x64 (glibc
  • "Who" is paying for the stream of pics of such quality and via a "very high speed fiber optic network"
    eg. If you are counting wildlife, ask the gov/state for more hardware.
    Cash might be very tight but gov data storage options should be usable.
    Is it OCR on cars? Changes in activity around buildings?
    If the "facility" has the need and cash to pay for images to be taken, optical and your work - ask for more cheap, fast storage.
    As for the "cloud" and the nature of your work be aware that the US and a few
  • by Anonymous Coward

    There's always Egnyte (https://www.egnyte.com/)

    They're not very expensive and they offer what they call an "ELC" (enterprise local cloud) or "OLC" (office local cloud). The way it works is you store the files in their datacenter and you can use their elc/olc clients effectively as a caching mechanism that is sync'd with cloud contents. This happens in such a way that anyone in your office/datacenter can access files from a common interface/api without having to saturate your 100meg pipe by fetching the same file multiple times.

    • by odgrim ( 3065827 )
      +1 to Egnyte. Full disclosure: I work there.
    • There's always Egnyte (https://www.egnyte.com/)

      They're not very expensive and they offer what they call an "ELC" (enterprise local cloud) or "OLC" (office local cloud). The way it works is you store the files in their datacenter and you can use their elc/olc clients effectively as a caching mechanism that is sync'd with cloud contents. This happens in such a way that anyone in your office/datacenter can access files from a common interface/api without having to saturate your 100meg pipe by fetching the same file multiple times.

      This is actually the solution I'm looking at now. Plus, I like the fact that they have an API we can hook into. On a side note: I'm very surprised by the immaturity of the responses from a lot of the Slashdot community.

  • That huge bandwidth is a major load requirement of the project, and it is going to cost you or your client too much money. I think you should simply look into separating the functionality so you can do the analysis at the customer's site, and only "get" (pulling from a db, a web service, or an RSS feed) the analysis results right there on the customer's site, while the rest of your application sits where it is now. From the sounds of it, the images are first saved somewhere on the customer's network, so perhaps it is no
  • by brad-x ( 566807 ) <brad@brad-x.com> on Thursday September 12, 2013 @12:11AM (#44826673) Homepage
    If you're going to sync nightly anyway, why bother with a cloud service? Just sync at night.
    • by 6of9 ( 111061 )

      +1

      According to my calcs, you've got bandwidth to spare to complete the transfer overnight. If for some reason you do need to do this during the day, find a competent network engineer to implement QoS on your network so your VoIP/pr0n/WoW doesn't suffer.

      • by AHuxley ( 892839 )
        I get the feeling they have to work on or respond to individual frames in near real time? What products produce that number and type of file 24/7?
        • My first thought was scameras / licence plate cameras at intersections, etc. I hope it's not something malicious like that!

          • by AHuxley ( 892839 )
            Yes, my thought too - why is the facility unfriendly to local processing? A prison, a Superfund site, in the ocean, a blimp, some size/heat issues?
            What produces images that fast and at that depth? Computer animation work done extra cheap?
            The optical part is a "?" too. That does not sound best-effort, average-telco-loop cheap. And "analysis," but at their convenience?
    • From looking over the specs, my guess is that the new customer is pushing the data to them and it is slowing down the office line while people are there working, so they want to be able to have the new customer push it somewhere else and then download it at night when the office isn't using the bandwidth for day-to-day operations.
      Some possibly better solutions, in descending order:
      1) Switch the new customer from pushing to allowing you to pull from them, i.e. ask them to cache it somewhere.
      2) instal

  • You mention rsync etc. ... is this really necessary? From your description it sounds like the biggest potential cost for you is going to be network (followed by storage), but depending on where you are using the data, the charges can vary wildly. For instance, incoming traffic from the internet TO Amazon is actually free, but outgoing is not. If you really want to save money you are probably better off actually doing your processing in the cloud as well. Otherwise those bandwidth charges are going to
  • The NSA will steal your photos. Unless your 'vision based company' is doing some shifty security work for the NSA. In that case, you're fine and have a ridiculous budget so this post doesn't apply to you.
  • Since you are just spooling, that should be more than adequate.

    (Not sure how someone else calculated 432GB/day, and I am horrified by the suggestion to overnight mail hard drives - way too expensive.)

  • by bill_mcgonigle ( 4333 ) * on Thursday September 12, 2013 @01:09AM (#44826965) Homepage Journal

    You're probably going to pay less for a second cable modem line than you will to store that much data in the cloud. Cloud processing is fairly cheap - cloud storage is expensive.

    And then you won't have to re-tool anything else in your processes, except maybe adding another route or two. If you're doing that much data processing, the $200/mo for the line shouldn't really be a huge expense on the contract.

    If you're looking to scale out this service to lots of companies, then the calculus might be different.

  • Amazon has a low-cost archival counterpart to S3 called Glacier, the downside of which is slow data retrieval.

    Also, on the extremely unlikely chance you're using Apple, there's a solid tool called Arq which will act as a front-end for Glacier, and add encryption and automation to boot.

  • It is more expensive than a cloud unless you are really big. Many startups that used to use Amazon's service decided that, with virtualization, it was cheaper to run their own once they needed fiber connections and the like to host massive bandwidth for all the boxen in the cloud.

    With 1/2 down, your speed will be adversely affected. vSphere is about $7,000 including a CentOS or Windows Server license, and Windows Server 2012 with Hyper-V is the same price. You can host VMs and have data backed up elsewhere for r

    • by Chrisq ( 894406 )

      It is more expensive than a cloud unless you are really big.

      Or indeed really small, so you would only need a fraction of a server.

      • You can get a LaCie server/network appliance for $599.

        It includes a 5-disk RAID and Linux running Samba, or if you want, $1,099 for one with Windows Server Small Business Edition, plus some SSDs for a few hundred more after that if you need Windows-specific stuff and more performance and RAM.

        It's a fast 140 MB/s system that can serve as a print server as well. I see big enterprises use them for small branch offices with 50 people where only a T1 is on the WAN. These save network bandwidth too, as they cache

  • by millisa ( 151093 ) on Thursday September 12, 2013 @04:41AM (#44827713)

    we're on a business-class 100 Mb cable connection
    100 Mbps = 12.5 MB/s (give up 15-20% for packet overhead: call it 10 MB/s).

    Distilling that summary into the data that mattered:
    a 1.5 MB image and a 3 MB file, together under 5 MB
    and
    images every 3-5 seconds

    The files are 5 megabytes total.
    In a perfect world, they'd transfer in 0.5 seconds.

    Leaving 2.5 - 4.5 seconds for the porn.

    Let's assume they are the bigger size, 5 megabytes, and they transfer at the more frequent interval, every 3 seconds.
    5 MB/3s = 1.66667 MB/s = 13.33333 Mbit/s.

    Why is a facility with a 100 Mb/s line incapable of handling this?
    How did a problem where a 100 Mb/s line can't handle 13.3 Mb/s come to the conclusion of "fix it with the cloud"?

    In any case, if you want to do a cloud setup, just about all of them will handle a small constant 13.3 Mb/s, and you'll pay more for it than if you figured out why your line isn't keeping up.
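
    The same sanity check in Python, using the summary's own worst-case numbers (protocol overhead ignored):

      # Sustained rate needed vs. link capacity.
      snapshot_mb = 5.0        # 1.5 MB image + 3 MB depth map
      interval_s = 3.0         # worst case: one snapshot every 3 seconds

      needed_mbps = snapshot_mb * 8 / interval_s     # ~13.3 Mbit/s
      link_mbps = 100
      print(f"need ~{needed_mbps:.1f} of {link_mbps} Mbit/s "
            f"({needed_mbps / link_mbps:.0%} of the line)")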

  • Storing 140 gigabytes a day is going to be expensive with any cloud service; you will essentially be using 4 terabytes per month in bandwidth, as well as a lot of disk storage -- cloud providers charge dearly for this.

    You might be better off getting your local network's connection upgraded. Obviously, this has benefits beyond merely offloading storage.

    We're now thinking the customer's workstation will sync the data with the cloud, and we can automate pulling the data during off hours so we won't en

    • cloud providers charge dearly for this.

      Not necessarily. FranTech's BuyVM [frantech.ca] will rent you a KVM VM with 500GB of space and 5TB of bandwidth for only $15/month. If you need more, you can step up to their 1TB of space with 10TB of bandwidth plan, which costs just $30/month. These plans are listed under Las Vegas - KVM Storage in the order form. It looks like they are out of stock right now (you can quickly check their stock here [doesbuyvmhavestock.com]), but I think they restock all of their plans on Mondays. BuyVM has a pretty good reputation and I have been enjoying th

  • I'd install a dedicated link and just add the cost of the link to the project's expense list.

    Sooner or later you're downloading the data anyway, and most customers I've dealt with would have an issue with spooling their data to the cloud at all -- it's why they contracted a small firm to do the processing in the first place.

    Let's face it -- network capacity is just not that expensive nowadays, especially seeing as you sound like you're primarily interested in download speed, which means

  • If 11 developers and 1 sysadmin can't figure this one out, you should all be fired.
  • Hire a network consultant to fix your broken internet. After that's done, have them figure out how you can scale. It's probably not a great idea to have to send all this stuff to your office. I am assuming you're using GPUs; those can be rented and/or bought. You probably want a system that can be distributed fairly well.

    The cloud is a buzzword, not a product. A colo'd 1RU server can hold about 40 TB of bulk storage. Most colos will let you use nearly unlimited inbound traffic (the normal ratio is 1 to 10 i

    • Except it is not the internet:

      "Our facility is incapable of handling such large transfers without effecting internal network performance."

      Which should be even easier to fix. He is trying to fix the wrong problem.

        • Lol, if 13 Mb/s is affecting LAN performance, you have serious issues. Still using a 10 Mb/s hub?

        • Well, I am going by what he said was the problem.

          I agree: if pulling 100 Mb/s from the internet breaks the internal network, something is very wrong.

  • There are several companies out there who do nothing but handle image processing "in the cloud". They could be used for simple bulk file transfers, or they might help solve the real problem -- dealing with large, uncompressed images.

    I know of two off the top of my head:

    • Cloudinary [cloudinary.com]: They will handle everything for you, including the storage and file conversions on the fly or via a pre-defined script.
    • Transloadit [transloadit.com]: They don't handle the storage, but instead interact with an Amazon S3 bucket you provide.
  • So 12 Mb/s (max) of transfers will bog down your 100 Mb/s connection so badly that you just cannot do it??? Uhm, are you sure about that???

    Well, OK then. Get another one.

  • The company we're gather images from ... ...without effecting internal network performance.

    I mean really... If you can't manage to write a coherent, error-free paragraph written in fairly simple SVO sentences or can't be bothered to proofread an article submission before posting, what makes you think that you could effectively manage a cloud-based infrastructure (or any other kind, for that matter)?

    Hell, with your skills just burn the files onto DVD's and toss them in the rubbish bin. It'll work just as we

    • You're without a doubt an absolute idiot! Why even bother taking the time to respond? One word: Troll.
  • If all you want is simple folder synchronization (a computer in TN writes a file to a folder, a computer in PA downloads it 10-20 seconds later), then you might want to look at EMC Syncplicity [syncplicity.com]. (I'm the desktop lead.)
  • by n7ytd ( 230708 ) on Thursday September 12, 2013 @04:45PM (#44834401)

    I don't understand what the issue is here. What the OP seems to be really asking is how to move the bandwidth requirement to overnight, when no one is using their connection for other business purposes.

    If time-shifting the syncing to off-hours is acceptable, why do you not install a server with a beefy hard drive at the client location to do just that?

    Have you explored the idea of compressing the data at the client side before sending it your way? Bitmaps often compress very well, especially if you can batch very similar ones together. A script to make a gzipped tar file every 5 minutes might do wonders for your data requirements.
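
    Something along those lines could be as simple as this sketch (directory names are invented placeholders; frames of a mostly static scene tend to gzip well):

      import glob, os, tarfile, time

      # Bundle the current batch of captures into one gzipped tar.
      # Both directories are illustrative placeholders.
      SPOOL = "/var/spool/captures"
      OUT = "/var/spool/outgoing"

      name = time.strftime("batch-%Y%m%d-%H%M.tar.gz")
      with tarfile.open(os.path.join(OUT, name), "w:gz") as tar:
          for path in sorted(glob.glob(os.path.join(SPOOL, "*.bmp"))):
              tar.add(path, arcname=os.path.basename(path))
              os.remove(path)   # drain the spool once archived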

    If you're ready to shell out the money for a cloud provider, why not instead shell out the money for a second connection to dedicate to this client?

    What does moving the data through a third party in "the cloud" offer over any of (or a combination of) these three approaches?

  • You might be able to utilize something like AWS S3 storage, which is low-cost for the storage itself, but AWS will also charge you for I/O to/from S3. This can become very costly if you transfer a lot of data into/out of AWS S3.

    Remember with a Cloud provider you have to pay to transfer the data IN and to transfer the data OUT.

    Have you priced what a faster internet connection would cost you?
    Or a 2nd internet connection just for this video traffic?
    Look beyond the cable MSOs also; what is a FiOS-based serv

"Protozoa are small, and bacteria are small, but viruses are smaller than the both put together."

Working...