A bridge between your data, and Infinite Storage

General JS3tream How To


How to create and use a key.txt file

It is possible to provide your Amazon S3 ID and Key on the command line with the -k and -s options. But this is not a good idea. On Unix systems, people will be able to few the running command line with the "ps" command, and see both your ID and key. Not very secure. By using the -K key.txt option, you can make things a lot more secure. Now, looking at the "ps" output, you'll only see the reference to a file. You can now set the file permissions of your key.txt file such that only people that should be able to read the file will have access.

A key.txt file is a very simple 2 line properties file. One line for each the key and secret. Note, the following is a completely made up key/secret pair.

  key=F83NAF83HFHAP983F3BH
  secret=haf98g87ftad7agfg3g7aa7dfgoao78asgd8fa7sd

You can now provide this filename to js3tream with the -K option.


How to list my archived buckets

JS3tream can list the buckets you have in your S3 account. This list is not restricted to only archives you have uploaded using JS3tream. This will list ALL of your S3 buckets, regardless of how they were created

  java -jar js3tream.jar --keyfile key.txt --list

The resulting output will look something like this

2007-01-04 04:11:40 - MyArchiveBucket
2007-11-26 09:33:02 - ABucketOfRandomStuff

How to list the files archived within a single bucket

Ok, so you know what S3 bucket you want to look into. Now, we want to see what archives we have saved to this bucket. Unlike the command that lists all of our buckets, this command look inside an S3 bucket for data that has been placed there by JS3tream. If your bucket contains data not put there by JS3tream, I'm afraid I can't guarantee what this output will do. So.. It's going to be in your best interest to put archives into a bucket designed to hold jS3tream archives.

  java -jar js3tream.jar --keyfile key.txt --list --bucket MyArchiveBucket

The resulting output will resemble this

2007-12-01 01:03:45 - MyArchiveBucket:mymusic.tgz - 20.31M in 1 data blocks
2007-11-15 01:03:46 - MyArchiveBucket:moremusic.zip - 19.67M in 1 data blocks
2007-12-01 01:00:30 - MyArchiveBucket:myresume.doc - 19.05k in 1 data blocks
2007-03-05 11:35:16 - MyArchiveBucket:home-movies.zip - 6667.98M in 267 data bloc
2007-03-06 09:36:17 - MyArchiveBucket:2004-taxes.pdf - 6405.70k in 1 data blocks
2007-03-06 09:41:01 - MyArchiveBucket:2005-taxes.pdf - 3159.58k in 1 data blocks
2007-03-06 09:44:19 - MyArchiveBucket:2006-taxes.pdf - 2585.29k in 1 data blocks

The output shows a lot of info about each archive within the bucket.


How to list the data chunks within a single archive

Ok.. We know how to list our S3 buckets, and how to list the archives within a bucket. But, what if we want to see exactly what data blocks make up a specific archive. When JS3tream sends data to S3, it sends it in manageable chunks. If you want to send 100M of data, and with the -z option tell JS3tream to use 10M chunks, then your archive will be made up of 100/10 data chunks.

This is not something that most people will want to do, sine the resulting output is really just useful for developers. All you need to do is identify the specific archive to inspect, and then provide the "--debug" option.

  java -jar js3tream.jar --keyfile key.txt --list --bucket MyArchiveBucket:home-movies.zip --debug

The resulting output will resemble this

2007-04-01 03:03:57 - home-movies.zip:00000000000000000000 - 25000K
2007-04-01 03:21:11 - home-movies.zip:00000000000000000001 - 25000K
2007-04-01 03:36:59 - home-movies.zip:00000000000000000002 - 25000K
2007-04-01 03:53:31 - home-movies.zip:00000000000000000003 - 25000K
2007-04-01 04:32:15 - home-movies.zip:00000000000000000005 - 25000K
...
2007-04-01 04:48:37 - home-movies.zip:00000000000000000264 - 25000K
2007-04-01 05:04:10 - home-movies.zip:00000000000000000265 - 25000K
2007-04-01 05:20:11 - home-movies.zip:00000000000000000266 - 5324K
2007-04-01 05:20:11 - MyArchiveBucket:home-movies.zip - 6667.98M in 267 data blocks

Don't be alarmed that your archive claims to have 267 data blocks, but the last data block has an index code of 000000000000266. This is because the data blocks start with 0, not 1.


How to delete an archive

We've backed up a new copy of our music archive. Now, it time to delete the old one because we don't want to keep 2 copies. Or perhaps you do? That is your call. But, in order to delete an archive, we use the -d option. Note, it is critical that you provide the prefix name of the archive with the ":" in the bucket name. Without the prefix, JS3tream will attempt to delete the bucket itself. But don't worry, JS3tream can't delete a non-empty bucket.

 java -jar js3tream.jar --keyfile key.txt --delete --bucket MyArchiveBucket:mymusic.tgz

Before going through with the delete, JS3tream will give you 10 seconds to cancel the command.


How to delete an entire bucket

Before you can delete an entire bucket, you must first delete ALL archives within it. If a bucket is not empty, then JS3tream will show an error message stating that the bucket must be empty first. Once you have removed all of the archives within a bucket, you can delete that bucket. Note the difference in this command from the command used to delete an archive. It is missing the ":" and prefix in the bucket name.

 java -jar js3tream.jar --keyfile key.txt --bucket MyArchiveBucket --delete

How to specify the data chunk size

The default for JS3tream is to send data to S3 in 5M chunks. For most people, this default will be just fine. Afterall, the number of data chunks a single bucket can hold is rather huge. Infact it's 263-1. Thats a lot of data chunks. Multiply that by 5M and we have a number too big for me to calculate in my head. But.. Lets just say you want to specify the size of the data chunks. This is where the -z option comes into play. simply provide a number "in bytes" you want the chunks to be.

To specify a 25M data chunk size, use the following command.

  type file.tgz | java -jar js3tream -K keyfile -i -b MyArchiveBucket:file.tgz -z 25000000

There is a catch though. JS3tream buffers your data in memory before it sends a data chunk to S3. This means that your process is going to need enough Java Heap to store the buffer. If you get an out of memory error from this, then see the FAQ


How to use buffer temp files instead of memory buffers

So, you tried the -z option, and got an OutofMemory error. Then you read the FAQ and though you'd try to use temp files instead of memory buffers. No problem, simply use the -f option. This tells JS3tream to buffer your data to a temp file, instead of in memory. NOTE however, this option slows jS3tream down quite a bit. temp files are no where near as fast as memory.


How to use the "neverdie" option

If your like me, you might have setup a cron job to backup your data on a regular basis. The "neverdie" option is going to save you some hassles. JS3tream will automatically attempt to re-send a data chunk that fails to upload. But, by default, JS3tream will give up trying to send the data after a specific number of tries. This is a major problem for very large archives. What if you had gotten to the 184th data chunk, of about 210 data chunks, and JS3tream fails for a simple temporary network problem. Now you have to start the upload at the start again! No good.

The "neverdie" option tells JS3tream to keep on trying to send your data FOREVER! If a data chunk upload fails, JS3tream will keep on trying until either you stop it, or the upload finishes. Now, temporary internet problems will not mess up your archive process.