Hello, I'm totally new here, and if my name gives you any clue (once burned, twice shy), I'm somebody who learned the HARD WAY. Now I'm trying to learn how to do things the right way.
To keep this from getting too huge (I'll happily fill anyone in on the boring details if you ask), I'll just jump right in.
Quick facts:
- I am trying to learn what is necessary to build a series (at least two steps, anyway) of NAS/SAN type devices, starting at the small end of a shoestring budget (I am a back-to-college student deep in debt and I expect things to be tight for years), which will do the job until I have the budget to hire a professional to take care of it.
- The purpose of this is Film Projects and Video Game Design stuff. As you may well be aware, data has grown to ginormous proportions. Star Wars 7, for instance, has over 1 PETABYTE of work data for it alone. 300TB is a figure I pulled out of my rump, but it's not unrealistic, certainly not for 'later' when shooting 4K/8K as opposed to 'now'. That level is not going to be needed for a few years; this is more about growing into it. It may even have to get larger.
- The budget is not zero, but it is not "fully adequate". It's more like there are always higher priority claims on available income or new loans which preferably get spent on things like camera equipment or lens rental.
- The people working on this are close to "nonprofit" status. What I mean by that is everyone is volunteering time in a creative collective, and we will be trying to pull each other up by each other's bootstraps instead of our own. Create YouTube shorts, indie games, and similar until something takes off. Once it does, try to kickstart everyone involved in the collective into being productive taxpaying members of society, no longer living in their parents' basements and eating ramen. I guess this is worth explaining because I'm sure it will be brought up how underfunded everything seems - it's just how it is. The point is that we can't afford to hire proper IT professionals when we can't even afford proper equipment and location shooting permits.
- Therefore, I have to learn how to do this myself. If I can't learn, it doesn't get done. After five years of waiting for someone else to volunteer to be the IT Professional, it Just Hasn't Happened. So I'm taking the initiative a second time, brushing off the dust, treating the bruises, and trying again. Nobody else in our group has had any kind of breakthrough of cash input, or met an IT guy who wants to put in this kind of time, either. I mean, it's just artists struggling to help one another, and right now lacking proper data storage is holding everyone and everything back. Yet I seem to have the best (still inadequate) computer knowledge for setting this up, so I'm volunteering.
- I already tried to do this, and it's where my name came from. I've got a box of like two dozen hard drives, either half dead or in the process of dying, full of lost data we hope to recover someday. They're from the last attempt, where the best I knew how to do ended up in a hopeless flustercluck.
- As soon as the money and income become available to migrate everything into "everything done right", that will be done immediately. There just has to be data around to migrate; it has to get that far. The problem is that this may well take 10-15 years, and I don't want to repeat the optimistic judgements that led to total catastrophic data loss last time.
- A lot of this is a research project that's going to be "mapping the future". Some implementations are going to be impossible on X budget - that's okay - but after figuring out clever ways to do more with less, I'm hoping to know where the lowest possible entry point is so that I'm able to jump when it is feasible. In some cases the only answer will be "you won't be able to upgrade to a better system unless prices drop or more money comes in", and as long as I know I've lowered that figure as much as humanly possible, that's okay. If I haven't lowered that figure, sometimes I'll literally be taking food out of the mouth of someone who sacrificed to donate money for hardware used by everyone else in the group, so please understand if I get a little religious about saving money. Saving $50 on a computer case, even when $1000 has to be spent on drives, is the difference between one of my actors eating well and Mac'n'cheese all month again.
- I don't expect full answers even in a year; I expect this to be a slow learning process that expands as I go. That's okay, because everyone previously involved is stuck with their own full-time jobs for a while, and it's going to take time to roll back to where we're ready to start filling drives with footage needing editing and such anyway.
Wow, that was longer than I thought, but hopefully it brings you up to speed.
At this point I still feel I am in the condition of "I don't know what I do not know", so I'm asking for suggestions on how to bring myself up to speed. This can be books, Wikipedia links, or just terms, but please understand my goal is not to become a full-time IT professional either. I'm trying to learn enough to do the job right, but the only job I have to do right is my own - I don't need to learn how to manage other people's servers, or familiarize myself with any old or cutting-edge tech I'm not currently using. I'm not trying to replace a normal multiyear education in the computer field.
My sole goal is getting up a highly reliable and expandable storage system on the lowest reasonable budget we can struggle through, so that everyone can finally get back to work without having to worry about LOST work from cutting the kind of corner that should never be cut. There's a difference between a money-saving strategy (not spending $400 on some nice 16-bay hot-swap hard drive rack, and just stuffing 8 drives apiece into two $30 midsize NZXT towers, or even two $5 used thrift store mid towers) and something fundamentally undermining what you are doing (stretching backup cycles to every two weeks instead of every day, buying "used" hard drives, using dodgy Chinese PSUs).
Examples of things I've read about or been interested in while pursuing this project:
- ZFS as a file system (really seems like an ultimate, as silent data corruption turned out to be a BIG problem in the tens of TB I had stored, which I only learned of later; the only problem is that either nobody builds really big systems or they cost astronomical amounts past some 'sweet spot')
- FreeNAS (or some other off-the-shelf, "low maintenance", not too demanding NAS system)
- used Fibre Channel HBAs, especially the 4 Gbit/s speed
- used InfiniBand adapters, especially the 10 Gbit/s speed
- last-generation tape like LTO-6, which is cheaper than even modern cheap hard drives, won't be murdered by a drop to the floor like a box of hard drives already was, is easier to mail through the post and have it survive, and is designed for 15-30 year archival lives. If this project hasn't turned profitable in 15 years I'm pretty sure it never will, so I'd like to plan on 15 years as a worst-case retention time for data.
- SAN setups, especially as a way to possibly hook up a more modular form of storage block designed around 8-10 drive ATX cases and consumer motherboards, instead of huge expensive RAID cards and motherboards that go beyond the consumer maximum of RAM (which ZFS under FreeNAS seems to need to run properly, and which seems to have seriously bottlenecked larger systems so far, since nobody builds them)
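To make the silent corruption worry from the ZFS bullet concrete, here is a rough Python sketch of the general idea: record a checksum for every file, then re-check later and see what changed. This is only an illustration under my own assumptions; the function names are made up for this post, and it is not how ZFS works internally (ZFS checksums blocks itself and verifies them during a scrub). It also can't tell corruption apart from a deliberate edit, so it only makes sense for footage that is supposed to stay unchanged.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large video files never sit fully in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_sha256(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_changed(old_manifest, new_manifest):
    """Return paths whose digest differs, or that vanished, since old_manifest."""
    return sorted(
        path
        for path, digest in old_manifest.items()
        if new_manifest.get(path) != digest
    )
```

The thought would be to save the manifest (as JSON, say) next to an archive when it is written, rebuild it every so often, and treat anything `find_changed` reports as a file to restore from another copy.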
There are multiple types of server that would have to be involved, and multiple boxes to be set up, which would change over time. For example, a higher-performance setup for film editing, backed by a lower-performance backup system that is only turned on once per day for daily backups and exports most data to LTO-6 tape until we have the local HD space to keep it online. Separate expansions of LTO-6, HD, and SSD capacity would depend on what is being done by whom and where the bottleneck is (lots of video footage might all go right to LTO-6 until we have the money to upgrade to SSDs to better work on it, for instance).
What's most important is OVERALL STRATEGIES and ASKING THE RIGHT QUESTIONS, which I don't even know yet, and that is why I haven't decided much about what I'm going to do. As I said, I "don't yet know what I don't know", and I'm open to ideas, suggestions, references, and sample builds others have done to get inspired. I realize that if I can better define the problem, people are better empowered to help me design a solution, but many of these concepts were totally new ways of thinking to me before I read about them (trying to wrap my head around all the ZFS concepts, for instance: I was used to just saving files to a fixed-size drive, and I could barely parse the ideas of storage pools and snapshots at first).
I'll stop talking so that some people can read and make a few starting comments before I expand on everything else. This is already a pretty big post to read, I'm aware. :^)