
Ask Slashdot: Cloud Service On a Budget?

samzenpus posted about a year ago | from the throw-a-little-money-at-it dept.

Businesses 121

First time accepted submitter MadC0der writes "We just signed a project with a very large company. We are a computer vision based company, and our project gathers images from a facility in PA. Our company is located in TN. The company we're gathering images from is on a very high speed fiber optic network. However, being a small company of 11 developers and 1 systems engineer, we're on a business class 100mb cable connection, which works well for us but not in this situation. The information gathered from the client in PA is a 1.5mb .bmp image along with a 3mb depth map file, making each snapshot a little under 5 megs. This may sound small, but images are taken every 3-5 seconds, which can lead to a very large amount of data captured and transferred each day. Our facility is incapable of handling such large transfers without affecting internal network performance. We've come to the conclusion that a cloud service would be the best solution for our problem. We're now thinking the customer's workstation will sync the data with the cloud, and we can automate pulling the data during off hours so we won't encounter congestion for analysis. Can anyone suggest a stable, fairly priced cloud solution that will sync large amounts of data offsite for retrieval at our convenience (a nightly rsync script should handle this process)?"
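A quick back-of-envelope check of those figures, taking the worst case of one ~5 MB snapshot every 3 seconds around the clock:

    echo "scale=1; 86400 / 3 * 5 / 1024" | bc    # ~140.6 GB captured per day
    echo "scale=1; 5 * 8 / 3" | bc               # ~13.3 Mbit/s sustained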


BYOS (2)

connor4312 (2608277) | about a year ago | (#44826435)

Bring your own server. Depending on the time frame/duration of the project, it might be more cost-effective to rent a quarter or half rack in a datacenter and build/buy your own servers. There's a high initial up-front cost, but it does save money in the long run.

Re:BYOS (3, Funny)

BobC (101861) | about a year ago | (#44826809)

Is the data being generated 24/7? If so, that's 432 GB/day, pretty much exactly 12 hours' worth of your 100 Mbps bandwidth. So some spooling is needed, but why in the cloud? The main goal would seem to be avoiding paying twice to move the data, so you'd want to avoid going through a 3rd party if at all possible.

1. The simplest solution would appear to be to put a laptop with a 500+ GB HD at their facility. A laptop because it essentially has a built-in UPS, and the CPU can sleep much of the time.

2. Develop a relationship with whoever provides their bandwidth. Find the nearest peering point. Put a laptop there.

3. Get the NSA as a client, do some sysadmin work for them. Your data will be RIGHT THERE!

Re:BYOS (1)

davester666 (731373) | about a year ago | (#44827217)

Or instead of the cloud, pay for a faster internet connection.
Or transport the data on HDs via Fedex if latency isn't a problem [I would use SSDs in a metal padded case knowing Fedex].

Re:BYOS (4, Funny)

Chrisq (894406) | about a year ago | (#44827631)

[I would use SSDs in a metal padded case knowing Fedex].

Fedex is like UDP, an unreliable delivery service. In fact there is only one fault of UDP it doesn't share: duplication. Things can arrive broken, out of order, delayed, or not at all, but I have never heard of Fedex delivering multiple copies!

Re:BYOS (4, Funny)

Anonymous Coward | about a year ago | (#44827683)

I have. Sometimes Amazon messes up. This is how I have a copy of XCOM. :)

Re: BYOS (0)

Anonymous Coward | about a year ago | (#44828177)

I wish I could upvote/like this

Re:BYOS (1)

orkim (238312) | about a year ago | (#44829519)

Your math is off just a little bit there.

60sec * 60min * 24hours = 86400 seconds in day

86400 seconds / 5 seconds = 17280 pictures

17280 pictures * 5 MB = 86400 MB

Or roughly 84.4 GB, which should lower the bar even more for that 100Mbps connection.

Re:BYOS (1)

houstonbofh (602064) | about a year ago | (#44826845)

Also, the data center bandwidth bills will be very high. I would get a second Internet connection at the home office and keep the server in house. Cheaper all round.

Re:BYOS (0)

Anonymous Coward | about a year ago | (#44827089)

I would process as much as possible in the datacenter (in PA, or wherever bandwidth is not an issue), then use rsync with compression to get whatever data is really needed back to TN. Also, make sure you factor in the real costs of this contract, and let the client know NOW if your cost estimate is too low. I once priced out a project before I started it (wiring an office) and in the end made $10/hour after all costs. Dumb.

Do not price your contract too low, grumble for 2 years, and then back out. If you are asking here on the forum, you need some serious help businesswise. BTW, going forward my wife does my pricing - I only do technical, and I know now that I have very little business acumen.

Paging an editor (0, Troll)

evilad (87480) | about a year ago | (#44826459)

Editor to the submission. Any available editor with a decent grasp of English vocabulary and grammar, please respond immediately.

Re:Paging an editor (0)

Anonymous Coward | about a year ago | (#44826509)

Editor to the submission. Any available editor with a decent grasp of English vocabulary and grammar, please respond immediately.

Come on, we know those are mutually exclusive these days!

What? (0)

CaseCrash (1120869) | about a year ago | (#44826751)

Screw the summary. Is this guy really asking us how to do his job?

Re:What? (1)

cjjjer (530715) | about a year ago | (#44829553)

Questions like this usually are.

I'll be the one to say it... (4, Insightful)

atari2600a (1892574) | about a year ago | (#44826467)

...WHY are you using BMP in the first place? Does whatever you're generating these on not have the processing capability to compress to PNG before transferring? I mean it SOUNDS like it'd save 10-20% off the total transfer...Anyways, what I'd do is I'd simply plop a server rack at the source that takes all the images for a given hour or whatever, tar.gz.bz2.whatevers them & send them over. Otherwise, I mean, Amazon wouldn't be TERRIBLE?
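A minimal sketch of that compress-at-the-source idea, assuming ImageMagick and inotify-tools are installed on the capture machine (the directory is a placeholder):

    # Watch the capture directory; convert each finished BMP to lossless
    # PNG and drop the original before anything crosses the WAN.
    inotifywait -m -e close_write --format '%w%f' /data/captures |
    while read -r f; do
        case "$f" in
            *.bmp) convert "$f" "${f%.bmp}.png" && rm "$f" ;;
        esac
    done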

Re:I'll be the one to say it... (0)

Anonymous Coward | about a year ago | (#44826591)

My thoughts exactly. Can't the files be losslessly compressed in real-time before sending them over, be it BMP->PNG, or 7zip or something?

Re:I'll be the one to say it... (1)

Anonymous Coward | about a year ago | (#44826601)

...WHY are you using BMP in the first place? Does whatever you're generating these on not have the processing capability to compress to PNG before transferring? I mean it SOUNDS like it'd save 10-20% off the total transfer...Anyways, what I'd do is I'd simply plop a server rack at the source that takes all the images for a given hour or whatever, tar.gz.bz2.whatevers them & send them over. Otherwise, I mean, Amazon wouldn't be TERRIBLE?

My (wild) guesses:
* they can't ask the HD security cameras their customers are using (e.g. for "face recog") to waste time on compression
* they can't plop a server rack near any (and all) of those security cameras; it would make them too obvious.

Re:I'll be the one to say it... (4, Interesting)

FishOuttaWater (1163787) | about a year ago | (#44826643)

Yes, your first line of defense is to examine what they need as far as these images, and that will tell you how far you can go in reducing their size for transmission and storage. Can they be scaled down? Can they be lossy? Can you take some time to run a more effective lossless algorithm on them? Is there redundancy between images? Secondly, do you have to move the whole image? Can you do your work on a lower quality image to define the series of steps required and then apply those steps remotely at their location? Just think real hard about what the requirements are, and don't rush yourself. You may come up with your best ideas in the shower on this when you have time to think outside the box.

Re:I'll be the one to say it... (1)

Osgeld (1900440) | about a year ago | (#44826733)

That's pretty much what we do in house: the machine generating the image data stores it locally, a server pulls the data on a schedule, and that server gets incrementally backed up offsite when no one is around.

The problem with having systems rely on a network to function is that you can halt production or lose valuable data (because that massive failure that got to the customer's customer WILL happen while your connection is down).

Re:I'll be the one to say it... (1)

gerf (532474) | about a year ago | (#44826995)

There are compression plug-ins for servers. He'll need a server on site as a buffer in case of a hiccup to the cloud at the very least. But if he's putting in a server, he can just do it all himself anyway.

Sounds like... (3, Interesting)

cultiv8 (1660093) | about a year ago | (#44826471)

the sales guy oversold your capabilities. Instead of asking about cloud options, why don't you just pick a server host with a good reputation (Amazon and Rackspace come to mind) and pass the costs onto the client?

Re:Sounds like... (0)

Anonymous Coward | about a year ago | (#44826545)

Alert: Sig correction needed. "(to) read" and "(have) read" are spelled the same. It's imperative to add "only" between the "I" and the /., asap.

Re:Sounds like... (2)

Dishwasha (125561) | about a year ago | (#44826723)

Or that the business and product owners under-priced the monthly contract with the client.

And what does your internal network have to do with the performance of your product? Separate your general business network from your server network, if not for performance or HIPAA, then for the day when one of your developers or unpatched machines does something to DoS your business.

Also, you might want to read up on MTU [wikipedia.org] . Large file transfers might be better served with an MTU larger than 1500.
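For what it's worth, a sketch of the jumbo-frame idea on Linux; the interface name and test host are placeholders, and every hop on the path has to support the larger MTU or this does more harm than good:

    ip link set dev eth0 mtu 9000    # raise the MTU on the transfer NIC
    ping -M do -s 8972 192.0.2.10    # 8972 B payload + 28 B of headers = 9000;
                                     # -M do forbids fragmentation, so a reply
                                     # proves the path really carries jumbo frames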

Re:Sounds like... (1)

klubar (591384) | about a year ago | (#44829389)

I have to agree that the host/server/bandwidth costs should be a relatively small factor in your calculation. Reliability, security and responsiveness really should be more important. The difference between top tier and bottom tier hosting/cloud is probably no more than a factor of 2 -- you can easily burn thru that savings with a couple of hours of downtime or a hosting vendor screw-up.

If cost is really important, I'd get it working first at a top tier vendor and then, over time, try to squeeze out costs -- either negotiating a better rate (based on your volume) or switching to a lower cost vendor.

Alternatively, why not just buy more bandwidth to your location? The bandwidth costs should be relatively low compared to the overall project costs. Also, this will provide you with office redundancy (at least at some level).

Too often in trying to save money, people focus on the wrong part of the problem.

Amazon DynamoDB (0)

Anonymous Coward | about a year ago | (#44826479)

Just the faqs [amazon.com] .

Snail Mail and a hard drive (4, Informative)

duckgod (2664193) | about a year ago | (#44826499)

Assuming you don't need real-time analysis (and it doesn't look like it from the problem description), send a couple of 500gb hard drives and have someone mail you the daily load of images with overnight shipping.

Re:Snail Mail and a hard drive (1)

Anonymous Coward | about a year ago | (#44826527)

This is actually a good solution.

I am informed that google does this regularly to save money transmitting large volumes of data.

Re:Snail Mail and a hard drive (2)

Resol (950137) | about a year ago | (#44826753)

Yep, in school (long ago) there was an old adage -- "never underestimate the bandwidth of a semi full of mag tapes". Sure the latency is high, but in many cases, not an issue!

Re:Snail Mail and a hard drive (1)

chrism238 (657741) | about a year ago | (#44827165)

"never underestimate the bandwidth of a semi full of mag tapes".

Agreed! My recollection is that it was from Andrew Tanenbaum, and it was "never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway". http://en.wikiquote.org/wiki/Andrew_S._Tanenbaum [wikiquote.org]

Re:Snail Mail and a hard drive (1)

drstevep (2498222) | about a year ago | (#44828391)

From my earlier years: Fast and cheap file transfer? A grad student with a mag tape.

Re:Snail Mail and a hard drive (0)

Anonymous Coward | about a year ago | (#44827109)

Came to the comments to say exactly the same. After looking into various off-site backup options for our server, we concluded it was by far cheapest just to purchase an identical unit, transfer locally, and store the backup elsewhere. Maybe a little inelegant, but far more practical.

Re:Snail Mail and a hard drive (1)

cbope (130292) | about a year ago | (#44827223)

Shuttling a couple of hard disks back and forth every day of the week using overnight shipping would be a fairly expensive option. You would need at minimum 2 sets of disks, sending them both ways every day, and the shipping costs alone would be high on a daily basis. We are talking 2x the daily overnight shipping cost for a 2 pound package, multiplied by an average 21 working days/month. I don't know what overnight shipping typically costs in the US these days, but let's say $25 per shipment and $50/day. The monthly shipping costs work out to $1050. And that does not include all the "manual" labor of copying data to/from the disks, packing, shipping paperwork, etc. The cost of the disks would be fairly trivial in comparison to the shipping costs.

Also, you would likely want a larger pool of disks to spread the failure risk, as all the bumps and shocks they receive being shipped back and forth every day are very likely to result in damage and short lifespans.

I totally agree with this method for one-time or infrequent large transfers, but I think you are creating more problems by trying to use it for daily transfers of data.

Re:Snail Mail and a hard drive (0)

Anonymous Coward | about a year ago | (#44828273)

OTOH, they only need 144GB/day, and you can easily stuff three 64GB microSD cards in a regular envelope and mail them for $0.45/day each way (round-trip shipping cost: less than $30/mo).

Or if you can afford to batch them for a week, you could probably still fit 16 microSD cards in a 1oz envelope and mail that for $0.45/week each way (round-trip shipping cost: less than $4/mo). Never mind that the envelope would contain over $500 worth of SD cards, so you'll be out a small fortune when the post office inevitably loses or destroys the letter.

Re:Snail Mail and a hard drive (1)

hendrikboom (1001110) | about a year ago | (#44830513)

You'd want to be able to save them for a week so you can retransmit if one of them gets lost in the mail.

-- hendrik

Re:Snail Mail and a hard drive (2)

bemymonkey (1244086) | about a year ago | (#44827447)

If a bigger pipe is too expensive, overnight shipping of a hard drive every day is going to be WAY too expensive.

Why don't you? (0)

Anonymous Coward | about a year ago | (#44826513)

Why not just get another connection that solely handles the data transfers? Also, compressing (zipping) the images first may reduce bandwidth needs.

It's not easy (0)

Anonymous Coward | about a year ago | (#44826515)

My solution will be to use Mac Minis for storage and processing (accessed through ssh/screen - individual user account per server process) and Raspberry Pis (with a distro I'll be calling Dr P Linux) for handling connections. More connections = more pis, more processing/services = more insecure gruntboxes.

Is Amazon S3 an option? (2, Informative)

Anonymous Coward | about a year ago | (#44826519)

Assuming 5MB of data every 5 seconds, you're dealing with ~90GB of data a day. So, looking at Amazon's pricing model (http://aws.amazon.com/s3/pricing/), and assuming you delete the data after you pull it, the storage total should be in the range of $0.095 * 90GB = $8.55/mo. Transfers into S3 are free. You'll be transferring ~2.7TB/mo out (90GB*30); at $0.120/GB, that's $324.00/mo in transfer fees.

Now, if that data isn't being accumulated 24/7 (i.e. if it's only 8/5, for example), that lowers your monthly fees to the $80 range. Sure, you can shop around for someone who will charge you less for transfers (though if they're not charging at all, they may start complaining at the volume you're transferring), but $350/mo in fees to keep a project that's making you money from killing your network? Sounds doable to me.
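Those figures are easy to re-run if the duty cycle changes; a throwaway check using the 2013 list prices quoted above:

    # ~90 GB/day parked briefly in S3, then pulled back out nightly.
    echo "storage : $(echo '0.095 * 90'      | bc) USD/mo"    # ~8.55
    echo "egress  : $(echo '0.120 * 90 * 30' | bc) USD/mo"    # ~324.00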

Re:Is Amazon S3 an option? (0)

Anonymous Coward | about a year ago | (#44826779)

Dropbox for business is $795/yr [dropbox.com] with no limit on storage or transfer. Maybe there's some hidden caps that they don't publish, but otherwise that seems significantly cheaper than Amazon.

Re:Is Amazon S3 an option? (0)

Anonymous Coward | about a year ago | (#44827021)

Dropbox for Business has a 1000gb cap, with a "call us if you want more space". No clue what they do if you call them.

Re:Is Amazon S3 an option? (2, Insightful)

Anonymous Coward | about a year ago | (#44827285)

Dropbox is just a VAR for Amazon S3 [dropbox.com] , so it couldn't possibly be cheaper. Most people don't know that half of Silicon Valley is running off Amazon AWS.

Re:Is Amazon S3 an option? (1)

DeSigna (522207) | about a year ago | (#44828183)

Alternatively, an $80/mo Linode (or similar) plan would cache 2 days of data (~200GB storage), offer some capacity to 'cook' it a bit before re-downloading (say, do some compression), and have enough transfer (8TB/mo), all in one shiny package. For pure storage, I think Dropbox and similar AWS-hosted services weigh in around the $60/mo mark for what would be needed.

Personally, I would spend money on an additional, dedicated Internet connection or (better) WAN tail to the customer and drop some staging hardware on their network border to ensure outages don't result in lost data.

The words 'budget' and 'cloud' together usually result in a selection of four-letter words, most notably, 'pain'.

Re:Is Amazon S3 an option? (2)

0100010001010011 (652467) | about a year ago | (#44828415)

Bittorrent Sync [bittorrent.com] is exactly what you're looking for.

I just set up this same thing to back up all my photos. I was bouncing between rsync, samba and other random programs. I wanted something to sync between numerous different computers and off site.

BitTorrent Sync solved all of this. It's almost as if they planned for people using it the way I am. In addition to Mac and Windows clients, they also have:

  • Linux ARM
  • Linux PowerPC
  • Linux i386
  • Linux x64
  • Linux PPC QorIQ
  • Linux_i386 (glibc 2.3)
  • Linux_x64 (glibc 2.3)
  • FreeBSD i386
  • FreeBSD X64

You can either set it up from the command line with a JSON config file or through a web interface on headless machines. I have it set up on one of my VPSs with a large disk. All of my family photos are now 'in the cloud', backed up off site. I added another VPS just to see what it'd do. It synced at around 2-3 MB/s between them and a bit from my home connection (it does use the bittorrent protocol). So now my home photos are on 2 different VPSs on two different continents. If I want to give someone access to them I can generate a read-only key or a time-limited read-only key.

One of the coolest features: I have a webserver where people upload family photos. I HAD an rsync cron job set up to sync the photos to my computer every night. Now the upload folder is a BitTorrent Sync folder. Within seconds of someone uploading photos, they get synced to my desktop, my laptop, my server, and my VPS on another continent.

If you want more redundancy add more servers. The more nodes you add the faster new nodes get 'up to date'.

Any sufficiently advanced technology is indistinguishable from magic.
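For anyone wanting to try the headless route the parent mentions, the rough shape looks like this; I'm going from memory of the Linux btsync binary's options, so treat the flags as approximate and check --help:

    ./btsync --dump-sample-config > sync.conf
    # edit sync.conf: set the storage path and a shared-folder secret
    ./btsync --config sync.conf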

Now repeat, but for cloud vendors (0)

Anonymous Coward | about a year ago | (#44826549)

We've come to the conclusion that a cloud service would be the best solution for our problem

Presumably you have looked at vendors already, otherwise how would you know capabilities exist in "the cloud"? Keep those cloud conclusions coming, until you find the right vendor.

Wildlife, production run or "other" pics? (1)

AHuxley (892839) | about a year ago | (#44826565)

"Who" is paying for the stream of pics of such quality and via a "very high speed fiber optic network"
eg. If you are counting wildlife, ask the gov/state for more hardware.
Cash might be very tight but gov data storage options should be usable.
Is it OCR on cars? Changes in activity around buildings?
If the "facility" has the need and cash to pay for images to be taken, optical and your work - ask for more cheap, fast storage.
As for the "cloud" and the nature of your work be aware that the US and a few other govs can have a look anytime.
http://www.smh.com.au/technology/technology-news/whistleblower-reveals-australias-spy-agency-has-access-to-internet-codes-20130906-2tand.html [smh.com.au] Best to air gap the 'results' part of your work from the bulk input and keep it all internal.

or maybe (1)

udachny (2454394) | about a year ago | (#44826587)

Has your organization by any chance just lost 90% of its administrators? I mean, I can imagine that having to store everybody's pictures, taken by cell phones, is quite an endeavor with so much of the workforce gone....

Egnyte (1)

Anonymous Coward | about a year ago | (#44826595)

There's always Egnyte (https://www.egnyte.com/)

They're not very expensive and they offer what they call an "ELC" (enterprise local cloud) or "OLC" (office local cloud). The way it works is you store the files in their datacenter and you can use their elc/olc clients effectively as a caching mechanism that is sync'd with cloud contents. This happens in such a way that anyone in your office/datacenter can access files from a common interface/api without having to saturate your 100meg pipe by fetching the same file multiple times.

Re:Egnyte (1)

odgrim (3065827) | about a year ago | (#44827029)

+1 to Egnyte. Full disclosure: I work there.

Compression? (0)

Anonymous Coward | about a year ago | (#44826599)

Lateral problem-solving here. Assuming you can't vary the BMP requirement, have you considered some sort of WAN compression? Riverbed is the market leader, but there are a host of other alternatives out there now. These sorts of boxes use all sorts of magic tricks, but the main ones are on-the-fly compression and caching of data. The caching may still be of use if there is repetition in the images - it works at the bitstream level rather than recognising entire files, meaning that if there is a similar block of data inside the BMP it can still benefit from the caching. The benefit will vary greatly based upon your source data, but I think you can get a unit on eval.

Riverbed in particular is not cheap (it seems to be priced just below the cost of a WAN upgrade), but it should improve the situation, and it can also limit the throughput so you can still utilise the link for other things. The only downside is you need an appliance (physical or virtual) on both sides, so you may need to get the other side to play ball.

Google App Engine? (0)

Anonymous Coward | about a year ago | (#44826631)

Very easy to prototype, and then you can up the ante to see if it scales. With over 4 languages supported, it's rather flexible.

Redesign (2)

Hanyu Chuang (3065541) | about a year ago | (#44826645)

That bandwidth is a major load requirement of the project, and it is going to cost you or your client too much money. I think you should look into separating the functionality so you can do the analysis on the customer's site, and only "get" (by pulling from a DB, a web service, or an RSS feed) the analysis results from there, while the rest of your application sits where it is now. From the sounds of it the images are first saved somewhere on the customer's network, so perhaps it is not much of a stretch to install your analysis app right there?

Spot instances or dedicated hosting package? (0)

Anonymous Coward | about a year ago | (#44826667)

I don't see why a cloud solution is so ideal for your system.
Note that everything from traffic to used disk space gets charged.
If you have a lot of traffic, you may be better off looking at an unmetered dedicated server, especially if you can calculate your needs in advance.
Some hosting companies offer reasonable prices for such packages compared to all the additional costs that Amazon will charge you.

Another option is using spot instances from Amazon (I don't know if other providers have them).
It is basically an auction-based usage scheme, but you have to model your processing for it.
Depending on the type of machine, you usually pay 1/3 ~ 1/4 of the normal fees.
If someone offers a better price your virtual server will be assigned to them, so you have to take into account that your server can be terminated at any point.
Your external storage will just be disconnected, so when you boot up another spot instance your data will still be available.
It's a bit tricky to set up properly, but it is a very cheap solution for processing a lot of data at low cost.
Of course, with Amazon you can combine normal permanent instances (e.g. a data collector) and permanent database services with multiple spot instances (e.g. analysis workers).
The inbound and outbound traffic costs for a spot instance are however the same as for a permanent EC2 instance.

why bother? (2)

brad-x (566807) | about a year ago | (#44826673)

if you're going to sync nightly anyway, why bother with a cloud service? just sync at night.
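Something like this is all the submitter's "nightly Rsync script" needs to be; a sketch assuming SSH access to a staging box at the customer site (host, user, and paths are made up):

    # /etc/cron.d/image-pull: fetch the day's captures at 01:00, capped
    # at ~80 Mbit/s (rsync's --bwlimit is in KB/s) so the office link
    # keeps some headroom.
    0 1 * * *  imguser  rsync -az --partial --remove-source-files --bwlimit=10000 capture@pa-site:/spool/images/ /srv/images/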

Re:why bother? (0)

Anonymous Coward | about a year ago | (#44826755)

That's a good one :)

OK, I have a question too (history related): what color was Henry IV's white horse?

Re:why bother? (1)

hendrikboom (1001110) | about a year ago | (#44830527)

White, if it existed. Black otherwise.

Re:why bother? (1)

6of9 (111061) | about a year ago | (#44826997)

+1

According to my calcs, you've got bandwidth to spare to complete the transfer overnight. If for some reason you do need to do this during the day, find a competent network engineer to implement QoS on your network so your VoIP/pr0n/WoW doesn't suffer.

Re:why bother? (1)

AHuxley (892839) | about a year ago | (#44827163)

I get the feeling they have to work on or respond to one frame in near real time? What products produce that number and that type of file 24/7?

Re:why bother? (1)

SGT CAPSLOCK (2895395) | about a year ago | (#44828253)

My first thought was scameras / licence plate cameras at intersections, etc. I hope it's not something malicious like that!

Re:why bother? (1)

AHuxley (892839) | about a year ago | (#44828345)

Yes, my thought too - why is the facility unfriendly to local processing? A prison, a Superfund site, in the ocean, a blimp, some size/heat issues?
What produces images so fast and at that depth? Computer animation work done extra cheap?
The optical part is a ? too. That does not sound best-effort, average-telco-loop cheap. And the term "analysis", but at their convenience?

Re:why bother? (0)

Anonymous Coward | about a year ago | (#44829279)

It's probably geological images for fracking or something, with the depth map data.

Are you planning to process the data in the cloud? (1)

antifoidulus (807088) | about a year ago | (#44826689)

You mention rsync etc.... is this really necessary? From your description it sounds like the biggest potential cost for you is going to be network (followed by storage), but depending on where you are using the data the charges can vary wildly. For instance, incoming traffic from the internet TO Amazon is actually free, but outgoing is not. If you really want to save money you are probably better off doing your processing in the cloud as well. Otherwise those bandwidth charges are going to eat you alive.

Colocate your box (0)

Anonymous Coward | about a year ago | (#44826705)

Assuming you don't care about the images themselves, just the result of processing them, you can save bandwidth by putting your own server at your customer's site.

Instead of transmitting the images to you to analyse, they would copy the files to your server via the LAN. You would connect to the server from your offices and execute your commands remotely. Again, assuming the results of the analysis are small you can send them back to yourselves, and send a copy locally to the customer.

If you *do* need the actual images in your office then my assumption doesn't hold, so use lossless compression, or find out what level of lossy compression is acceptable. And never underestimate the bandwidth of a truck full of hard drives.

Amazon Web Services (0)

Anonymous Coward | about a year ago | (#44826729)

Amazon Web Services does not charge for data transfer into an AWS instance currently: http://calculator.s3.amazonaws.com/calc5.html

We are using AWS server instances to download and process multiple large files that are in the 10 to 150 GB range each.

If you shut down your analysis instances when they are not being used, you will only get charged for the storage that they consume, not the compute time.

You can further reduce your costs by using a reserved instance to be available 24/7 to receive image files and launch spot instances to run analysis.

Once you are finished with the images, dump them into AWS S3 and then into AWS Glacier for longer term storage.
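The S3-to-Glacier handoff can be automated with a lifecycle rule; a sketch using aws-cli, where the bucket name and day counts are placeholders:

    aws s3api put-bucket-lifecycle-configuration --bucket image-spool \
      --lifecycle-configuration '{
        "Rules": [{
          "ID": "archive-then-expire",
          "Prefix": "",
          "Status": "Enabled",
          "Transitions": [{"Days": 30, "StorageClass": "GLACIER"}],
          "Expiration": {"Days": 365}
        }]
      }'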

Don't use the cloud! (1)

dohzer (867770) | about a year ago | (#44826773)

The NSA will steal your photos. Unless your 'vision based company' is doing some shifty security work for the NSA. In that case, you're fine and have a ridiculous budget so this post doesn't apply to you.

QoS (0)

Anonymous Coward | about a year ago | (#44826795)

Buy network equipment that can set sufficient QoS to cap the transfers to ~80% so that your other traffic won't get bogged down. I'm guessing that is what "internal network can't handle it" means.
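On Linux that cap can be a one-liner with tc; a sketch, with the interface and rate as placeholders:

    # Token-bucket filter: bulk transfers leaving this box can never
    # claim more than 80 Mbit/s of the 100 Mbit/s line.
    tc qdisc add dev eth0 root tbf rate 80mbit burst 32k latency 400ms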

NAS (0)

Anonymous Coward | about a year ago | (#44826893)

Buy a network attached storage device. If you are worried about it, buy two. Keep it on a LAN. Don't attach it to a wan. Store your data redundantly on it. Automated backups are best. Have someone maintain the archives. Keep it off the 'net, keep it away from the NSA. Remember: NAS good, NSA bad.

Google Drive, 100GB, $4.99/mo (1)

ysth (1368415) | about a year ago | (#44826915)

Since you are just spooling, that should be more than adequate.

(Not sure how someone else calculated 432GB/day, and I am horrified by the suggestion to overnight mail hard drives - way too expensive.)

snail mail + local server (0)

Anonymous Coward | about a year ago | (#44826917)

We house about 15TB of video data, growing about 5TB a year, at my shop - we also work in CV. I'd say an in-house server is your best bet for local processing... the cloud's storage costs may eat you alive over time. Get a second internet connection if you don't want to be affected. I suggest snail-mailing drives for the large collections. 3TB drives are about 100 bucks right now, and SATA copies at about 120MB/s, so the copy jobs don't take too long assuming your server is decent.

As for compression... I imagine anything other than lossless is out. I can only really think of compressing the BMP easily, and for that I'd use tiled Adobe-deflated (i.e. zip) TIFFs - you'll have to find the sweet spot for tile size. It will cut the size down to about 2/3rds. There's also JPEG 2000's lossless mode, but that will be SLOW to work with and less software supports it than tiled TIFFs, though it will have much better compression ratios on average. You're on your own for the depth maps, unless they are raster, in which case you can probably do the same thing with them. Since most sensors are some odd number of bits and there's spatial locality (tiles exploit this), you'll tend to see significant savings, but likely nothing smaller than half the original size.
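The lossless repack described above is a one-liner per file with ImageMagick (leaving out the tile-geometry tuning, which varies by build; file names are placeholders):

    # Deflate-compressed TIFF; per the parent's estimate this typically
    # lands around 2/3 of the BMP's size, with no loss.
    convert capture.bmp -compress Zip capture.tif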

Get another line. (2)

bill_mcgonigle (4333) | about a year ago | (#44826965)

You're probably going to pay less for a second cable modem line than you will to store that much data in the cloud. Cloud processing is fairly cheap - cloud storage is expensive.

And then you won't have to re-tool anything else in your processes, except maybe adding another route or two. If you're doing that much data processing, the $200/mo for the line shouldn't really be a huge expense on the contract.

If you're looking to scale out this service to lots of companies, then the calculus might be different.

Glacier (1)

berchca (414155) | about a year ago | (#44827151)

Amazon has a low-cost version of S3 called Glacier, the downside of which is slow data retrieval time.

Also, on the extremely unlikely chance you're using Apple, there's a solid tool called Arq which will front-end for Glacier, and add encryption and automation to boot.

Get another line (0)

Anonymous Coward | about a year ago | (#44827215)

100Mbit is a standard residential line, so why not just pay the extra $50/mo and get another line for the image feed?

You say you are getting around 5 Mbytes every 3 seconds; that is about 13 Mbit/s. That much downstream capacity is easy to get.

You can also do some network wizardry and limit your receiving PC to 20 Mbit/s so as not to kill the 100Mbit/s network. Rate limiting is not a big thing. Then there will be no bursts killing the connection.

Blocking Youtube / Netflix etc might help too.

Professional? (0)

Anonymous Coward | about a year ago | (#44827243)

You're getting paid, right? Do your own fucking job.

you should already have (0)

Anonymous Coward | about a year ago | (#44827399)

your fucking solution already... given that you proposed something and then "signed a project with a very large company"

slashdot is not letothersdoyourwork.com

compress before uploading ? (0)

Anonymous Coward | about a year ago | (#44827413)

Why can't they compress the .bmp files using 7zip or similar high-compression s/w before uploading?

Cloud is only going to slow down and complicate this.

Don't bother (2)

Billly Gates (198444) | about a year ago | (#44827539)

It is more expensive than a cloud unless you are really big. Many startups that used to use Amazon's service decided with virtualization it was cheaper to use their own after they needed fiber connections and others to host massive bandwidth for all the boxens on the cloud.

With half your bandwidth in use, your speed will be adversely affected. vSphere is about $7,000 including a CentOS or Windows Server license, and Windows Server 2012 with Hyper-V is the same price. You can host VMs and have data backed up elsewhere for redundancy. Yes, this will eat up data and raise costs on your T3, but it will consume less data than clouding everything.

I repeat: the cloud does not save you money once you factor in all the hidden costs.

Re:Don't bother (1)

Chrisq (894406) | about a year ago | (#44827639)

It is more expensive than a cloud unless you are really big.

Or indeed really small, so you would only need a fraction of a server.

Re:Don't bother (1)

Billly Gates (198444) | about a year ago | (#44827665)

You can get a LaCie server/network appliance for $599.

It includes a 5-disk RAID and Linux running Samba; or, for $1099, one with Windows Server Small Business Edition, plus some SSDs for a few hundred more after that if you need Windows-specific stuff and more performance and RAM.

It's a fast, 140 megs-a-second system that can serve as a print server as well. I see big enterprises use them for small branch offices with 50 people where only a T1 is on the WAN. These save network bandwidth too, as they cache copies of files from popular shares across the WAN and sync them gradually!

You do not need a nice $50,000 rack unless you are medium-sized. They are very easy to set up, and people forget appliances can be the way to go for many. Versus $300 a month for a cloud service plus $200 for network bandwidth, you can meet the ROI within 1 month!

5mbytes every 3seconds is only 13.333 mbits/s. (4, Informative)

millisa (151093) | about a year ago | (#44827713)

we're on a business class 100mb cable connection
100mbps = 12mbyte/s (give up 15-20% for the packet overhead, 10megabytes/sec).

Distilling that summary into the data that mattered:
1.5mb image + 3mb depth file, together a little under 5 megs.
and
images every 3-5 seconds

The files are 5megabytes total.
In a perfect world, they'd transfer in 0.5 seconds.

Leaving 2.5 - 4.5 seconds for the porn.

Let's assume they are the bigger size, 5megabytes, and they transfer in the more frequent number, every 3 seconds.
5MBytes/3s = 1.66667 Mbytes/s = 13.33333 mbits/s.

Why is a facility with a 100mb/s line incapable of handling this?
How did a problem where a 100mb/s line can't handle 13.3333mb/s come to a conclusion of "Fix it with the cloud?"

In any case, if you want to do a cloud setup, just about all of them will handle a constant 13.3mb/s, and you'll pay more for it than if you figured out why your line isn't keeping up.

14Mbps - Problem? (0)

Anonymous Coward | about a year ago | (#44828947)

I concur, the transfer rate is in the range of 14Mbps and the 100Mbps circuit should be able to very easily handle it. This person either doesn't have a clue what they are doing or talking about, or bandwidth is a red herring and not the real issue.

144GB of storage per day (5MB/3sec.) might be the issue. At that rate, he'd fill a ~3TB drive about every three weeks. This also doesn't seem like a major problem.

It's really looking like this guy doesn't have a clue!

Re:5mbytes every 3seconds is only 13.333 mbits/s. (0)

Anonymous Coward | about a year ago | (#44829019)

The problem with these calculations is that the cable company likely doesn't have a SLA with 100mbps speed guaranteed at all times, and it is instead a "typical" or "maximum" speed.

You can't be serious (0)

Anonymous Coward | about a year ago | (#44827873)

If you can't deliver the service, you should not have signed the contract. Unless you can think up some hack quickly, the only correct solution here is to go out of business as a direct result of your gross incompetence.

Expen (1)

mysidia (191772) | about a year ago | (#44827955)

Storing 140 gigabytes a day is going to be expensive with any cloud service; you will essentially be using 4 terabytes per month in bandwidth, as well as a lot of disk storage --- cloud providers charge dearly for both.

You might be better off getting your local network connection upgraded. Obviously, this has benefits beyond merely offloading storage.

We're now thinking the customer's workstation will sync the data with the cloud, and we can automate pulling the data during off hours so we won't encounter congestion for analysis.

If you are pulling data off-hours anyway, perhaps the best thing to do would be to have a local server as close as possible to the point where the data is being gathered, to receive the data and handle sending it to you --- e.g. "collect the data on-site", right by the place it's being gathered.

This should help with the end-to-end congestion thing.

Re:Expen (1)

loosescrews (1916996) | about a year ago | (#44828399)

cloud providers charge dearly for this.

Not necessarily. FranTech's BuyVM [frantech.ca] will rent you a KVM VM with 500GB of space and 5TB of bandwidth for only $15/month. If you need more, you can step up to their 1TB of space with 10TB of bandwidth plan that costs just $30/month. These plans are listed under Las Vegas - KVM Storage in the order form. It looks like they are out of stock right now (you can quickly check their stock here [doesbuyvmhavestock.com] ,) but I think they restock all of their plans on Mondays. BuyVM has a pretty good reputation and I have been enjoying their service for quite a while.

Re:Expen (0)

Anonymous Coward | about a year ago | (#44828643)

If you're going down that route, why don't you point people to the source instead of your bullshit affiliate link?

http://lowendbox.com

Install a dedicated link (1)

msobkow (48369) | about a year ago | (#44828157)

I'd install a dedicated link and just add the cost of the link to the project's expense list.

Sooner or later you're downloading the data, and most customers I've dealt with would have an issue with spooling their data to the cloud in the first place -- it's why they would have contracted a small firm to do the processing at all.

Let's face it -- network capacity is just not that expensive nowadays, especially seeing as you sound like you're primarily interested in download speed, which means you can opt for asymmetric connections that have greater download capacity than upload (and are usually cheaper).

And for God's sake -- compress your data!!!

Re:Install a dedicated link (0)

Anonymous Coward | about a year ago | (#44829583)

Yep, rsync -z and a second business class connection would be my first thought.

"Cloud" is the new buzzword (0)

Anonymous Coward | about a year ago | (#44828305)

Just curious - why do you think you need anything with "cloud" in the title?

A simple rsync cron job will take care of everything. Build a local server cluster and co-locate in a local datacenter. Far, far, far less expensive than something with "cloud" in the title and you'll have total control over everything. Expect this solution to pay for itself in about 6 - 10 months versus going cloud.

You mention that your 100 megabit connection isn't fast enough, so you're thinking of going to the cloud. Going to the cloud won't improve your local Internet connection.

11 devs + 1 sysadm (1)

csumpi (2258986) | about a year ago | (#44828445)

If 11 developers and 1 sysadmin can't figure this one out, you should be all fired.

The cloud is nearly never a good idea (1)

silas_moeckel (234313) | about a year ago | (#44828527)

Hire a network consultant to fix your broken internet. After that's done, have them figure out how you can scale. It's probably not a great idea to have to send all this stuff to your office. I am assuming you're using GPUs; those can be rented and/or bought. You probably want a system that can be distributed fairly well.

The cloud is a buzzword, not a product. A colo'd 1RU server can hold about 40TB of bulk storage, and most colos will let you use nearly unlimited inbound traffic (the normal ratio is 1 to 10 inbound to outbound, and they pay for the higher of the two), so it's effectively a free resource. Past that, whatever you need to process that data can be shifted into the colo. Two or more sites in the long term, with cross-failover and load balancing, is probably your best long-term position.

Re:The cloud is nearly never a good idea (1)

funwithBSD (245349) | about a year ago | (#44829071)

Except it is not the internet:

"Our facility is incapable of handling such large transfers without effecting internal network performance."

Which should be even easier to fix. He is trying to fix the wrong problem.

Re:The cloud is nearly never a good idea (1)

silas_moeckel (234313) | about a year ago | (#44829701)

Lol, if 13mb/s is affecting LAN performance you have serious issues. Still using a 10mb/s hub?

Best Solution is NOT Cloud, It's Chattanooga (0)

Anonymous Coward | about a year ago | (#44828545)

Your company is small but seems to be growing. Move it to Chattanooga, TN and grab that gigabit fiber; then you can just do transfers as the photos are taken/processed. It's not so radical - no different than startups moving to California in search of money. It's out of the box, but the best solution. Plus, once there, you can sell that service all over the country because you are on a gig fiber network.

One option is an image processing co. (1)

OverZealous.com (721745) | about a year ago | (#44829247)

There are several companies out there who do nothing but handle image processing "in the cloud". They could be used as simple bulk file transfers, or they might help solve the real problem: dealing with large, uncompressed images.

I know of two off the top of my head:

  • Cloudinary [cloudinary.com] : They will handle everything for you, including the storage and file conversions on the fly or via a pre-defined script.
  • Transloadit [transloadit.com] : They don't handle the storage, but instead interact with an Amazon S3 bucket you provide. Image conversion requires creating a "script" in the web interface, but otherwise they are fairly similar.

In either case, your clients can upload the files directly to their servers, and the 3rd party company can begin converting immediately (if you choose).

not that hard (0)

Anonymous Coward | about a year ago | (#44829631)

assuming fastest snapshot every three seconds....

86400 s in a day / 3 sec snapshot * 5 MB per image = ~ 144 GB per day

A batch download at night will take 3 hours, manageable assuming you must stay with .bmp

Switch to jpg and play with the compression to suit your needs, and you can easily slash that size by a factor of 10 or more.
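If lossy turns out to be acceptable, sweeping the quality knob takes a minute with ImageMagick (file names are placeholders):

    # Write the same capture at several JPEG quality levels, then
    # compare sizes against the original BMP.
    for q in 95 90 85 75; do
        convert sample.bmp -quality "$q" "sample-q${q}.jpg"
    done
    ls -lh sample.bmp sample-q*.jpg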

Are you management? (0)

Anonymous Coward | about a year ago | (#44829669)

We have a problem with our %dailyActivity%, can the %popularBuzzword% fix it?

this is just way too simple (1)

sribe (304414) | about a year ago | (#44829797)

So 12mb/s (max) of transfers will bog down your 100mb/s connection so badly that you just cannot do it??? Uhm, are you sure about that???

Well, OK then. Get another one.

Amazon S3 (0)

Anonymous Coward | about a year ago | (#44830209)

9.5 cents / GB month, plus a couple of cents per 1000 requests to the system. Seems pretty reasonable to me.

http://aws.amazon.com/s3/pricing/
http://www.sitepoint.com/5-useful-amazon-s3-backup-tools/

Please to be doing my job kind sirs (0)

Anonymous Coward | about a year ago | (#44830365)

I am in looking for the server cloud. Please to be kindly corresponding with solutions for my project. I am most anxiously and Ranjali is most missing me from the home time.

Please do the needful.

Blob & Queue (0)

Anonymous Coward | about a year ago | (#44830519)

I would recommend setting up a Blob and Queue on a cloud provider like Azure. If you stored all the images on a Blob, while putting information in a queue that they are there to be processed, you could enable a buffer between your client and your office.

At some point you will have to ensure that you are keeping up with the amount of information that they deliver.

If you are interested in talking about this shoot me an email: jef@agilebusinesscloud.com

Just get another cable connection or DSL... (0)

Anonymous Coward | about a year ago | (#44830841)

Why not opt for a secondary internet connection that is only used for client data transfer? Would be way cheaper in the long run, especially if you have your own computing resources to use for the work once you have the data.
