Angry computer nerd rant (for your potential amusement)

SvenTheViking
Regular
Posts: 65
Joined: Thu Dec 29, 2011 4:19 am
Projects: Chronicles of the Timetraveler's Wars
Organization: ProgMan Productions; Polymorphic Games
Location: Oregon

Angry computer nerd rant (for your potential amusement)

#1 Post by SvenTheViking » Wed Jan 04, 2012 4:25 am

I gotta get this off my chest, and I know there will be at least a few people here who will be able to commiserate with at least part of this.

I'm migrating a website from one server to another. If you've never done that, imagine moving a ton of files from the computer of someone you've never met to the computer of a friend, having to keep everything exactly how it is, and then get everything to work right after the files are copied. It's a crude explanation, but it's fairly accurate.

That's more or less my situation. I'm tasked (willingly, I might add, which just makes this worse, as it's all my fault) with moving a website from its old home to its new home, as the old home is being abandoned because the guy who used to run it no longer wants to deal with the headache that it is. This is entirely understandable; as a guy who has run websites, I can relate. I found out that I could relate even more after I looked at the design the guy had implemented.

To put it simply, it's a mess. It's a God awful, unholy nightmare of a mess. First off, the source code. The website's done in PHP, written while the guy was learning not only the language, but programming in general. There are two or three iterations of the website mashed in together, and I can hardly make heads or tails of it. It's mildly annoying, but that's not the worst part.

No, the worst is the directory structure. You see, the website was created to host images. THOUSANDS of them. HUNDREDS OF THOUSANDS of them. Screencaps taken every five seconds from all 90+ episodes of a half-hour TV show. It's taken me THREE DAYS of digging around in the directories to make some sense of the so-called organization.

There are four directories with images in them, which I have just today found to be in two pairs. The first pair is fairly obvious: "caps" and "cap_thumbs". As one might guess, "caps" contains screencaps, and "cap_thumbs" contains thumbnail images of those caps. Thing is, the directory is just a flat dump of 2000+ images named incrementally (1.jpg, 2.jpg, 3.jpg, etc.), so they're completely unlabeled. Where did these images come from? Why are they here? What episodes are these from? Do you have a functioning brain or did you just bash on a keyboard until something happened?

The second pair is "images" and "photogallery". "images" contains the full-size images, and "photogallery" contains thumbnails. I assume this is because, at one point, they had planned on having a photogallery going, but had never finished it. The structure of these directories is such that they contain 182 sub-directories, most corresponding to an episode of the show.

"But wait," you say. "You said that there are only 90-some episodes. What's going on?"

Well, apparently, several of the directories, each containing over 2000 images, are duplicates. While this may be bad, there is another problem which makes it worse.

You see, the website is almost 10 years old now. And through much of that time, people have been linking to images on the site, which is the entire purpose of the image collection. So, as it stands, I have to find a way of organizing 16 (SIXTEEN!!) gigabytes worth of JPEG images, many of which are duplicates, without changing their names or locations.

If I had a desk, my head would have hit it so many times that I would have broken through the wood, metal, laminate, and anything else, to the point where it would be little more than a collection of tiny scraps.

And the best part? I've been laughing through the entire thing. I don't know why, but I find this absolutely hilarious.
"Be not afraid of greatness: some men are born great, some achieve greatness, and some have greatness thrust upon them."
William Shakespeare, "Twelfth Night"

PyTom
Ren'Py Creator
Posts: 15893
Joined: Mon Feb 02, 2004 10:58 am
Completed: Moonlight Walks
Projects: Ren'Py
IRC Nick: renpytom
Github: renpytom
itch: renpytom
Location: Kings Park, NY

Re: Angry computer nerd rant (for your potential amusement)

#2 Post by PyTom » Wed Jan 04, 2012 10:40 am

SvenTheViking wrote:It's a God awful, unholy nightmare of a mess. ... The website's done in PHP
Sir... I think you repeat yourself.

More seriously, wouldn't the solution to this be to build a database to store the old and new location of every image?
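
A minimal sketch of that idea, assuming SQLite and Python rather than whatever the site actually runs on. The table name `image_moves`, the two helper functions, and the example paths are all made up for illustration; nothing here comes from the real site.

```python
import sqlite3


def build_redirect_table(conn, moves):
    """Record (old_path, new_path) pairs; each old path maps to one new path."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS image_moves ("
        "old_path TEXT PRIMARY KEY, new_path TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO image_moves (old_path, new_path) VALUES (?, ?)",
        moves,
    )
    conn.commit()


def resolve(conn, requested_path):
    """Return the new location for an old URL path, or the path unchanged."""
    row = conn.execute(
        "SELECT new_path FROM image_moves WHERE old_path = ?",
        (requested_path,),
    ).fetchone()
    return row[0] if row else requested_path


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # Hypothetical example paths, not taken from the real site.
    build_redirect_table(conn, [("/caps/1.jpg", "/episodes/ep01/0001.jpg")])
    print(resolve(conn, "/caps/1.jpg"))
```

A request handler could call something like `resolve()` and answer with a redirect, so old links keep working while the files themselves move.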
Supporting creators since 2004
(When was the last time you backed up your game?)
"Do good work." - Virgil Ivan "Gus" Grissom
"Silly and fun things are important." - Elon Musk
Software > Drama • https://www.patreon.com/renpytom

Sapphi
Eileen-Class Veteran
Posts: 1685
Joined: Fri Jun 05, 2009 3:31 am
Completed: Boku no Taisetsu na Yumeko
Projects: Twelve, PAW ★ PRINTS
Organization: Kitsch-soft
Location: Illinois, USA

Re: Angry computer nerd rant (for your potential amusement)

#3 Post by Sapphi » Wed Jan 04, 2012 4:30 pm

SvenTheViking wrote:Where did these images come from? Why are they here? What episodes are these from? Do you have a functioning brain or did you just bash on a keyboard until something happened?
:lol: :lol: :lol:

This reminds me of the hard drive situation on our family computer (which, thank God, I am not using anymore). I'm not sure, but I think the contents of drives have been dumped into each other alternately as problems have happened... which means now it's a confusing mess of duplicate files, secret folder passageways, shortcuts that go nowhere, old games, pieces of old games, etc. It's kind of beautiful in a morbid way.
"It is [the writer's] privilege to help man endure by lifting his heart,
by reminding him of the courage and honor and hope and pride
and compassion and pity and sacrifice which have been the glory of his past."
— William Faulkner
▬▬▬▬▬▬▬▬▬▬..+X+..▬▬▬▬▬▬▬▬▬▬

LateWhiteRabbit
Eileen-Class Veteran
Posts: 1866
Joined: Sat Jan 19, 2008 2:47 pm
Projects: The Space Between

Re: Angry computer nerd rant (for your potential amusement)

#4 Post by LateWhiteRabbit » Wed Jan 04, 2012 5:21 pm

Sapphi wrote: This reminds me of the hard drive situation on our family computer (which, thank God, I am not using anymore). I'm not sure, but I think the contents of drives have been dumped into each other alternately as problems have happened... which means now it's a confusing mess of duplicate files, secret folder passageways, shortcuts that go nowhere, old games, pieces of old games, etc. It's kind of beautiful in a morbid way.
I sometimes wonder what the state of file keeping will be on the internet or a supercomputer in 100 years. All of human knowledge, filed and named as each generation of individuals sees fit, lost and layered in deeper and deeper sub files, duplicates here, duplicates there. Thousands of formats, hidden folders and directories, memes layered upon memes that depend on archaic knowledge of minute aspects of transient popular culture.

The Computer of Babel, wider than the world and deeper than the sea.

I don't know whether to shudder or smile . . . .

Spiky Caterpillar
Veteran
Posts: 253
Joined: Fri Nov 14, 2008 7:59 pm
Completed: Lots.
Projects: Black Closet
Organization: Slipshod
Location: Behind you.

Re: Angry computer nerd rant (for your potential amusement)

#5 Post by Spiky Caterpillar » Wed Jan 04, 2012 6:55 pm

So, sixteen gigs of files, of which maybe about four gigs are actually _unique_, organized in approximately four (possibly more, I'll bet that there are a few testrun outputs in there) different, yet impractical ways, all of which need to be accessible through every URL that anyone has ever used to point at them over the past decade, no matter how illogical? Fun!

IMO, the best way to cut the mess down to reasonable size (after backing it up, of course, since no reorganization is complete without leaving a tarball of the original layout somewhere. If you don't fill your disks, hard drive manufacturers will no longer have incentive to drive capacity up!) is to pick the sanest directory structure (probably some form of per-episode folders), then write a script to go through, find the duplicate files, and replace all but the most-sanely-located of them with a relative symlink to the copy you're keeping.
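
For the curious, a rough sketch of such a script in Python. It assumes a POSIX filesystem and that "duplicate" means byte-identical (so a re-encoded copy of the same frame would not match). The function names and the `preferred_dir` parameter are invented for this example, and it should only ever be pointed at a backed-up copy of the tree.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large images don't fill memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def dedupe_with_symlinks(root, preferred_dir):
    """Replace byte-identical duplicates under root with relative symlinks.

    The copy kept is the one under root/preferred_dir when there is one,
    otherwise the first path in sorted order. Returns the number of files
    that were replaced by symlinks.
    """
    by_digest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # already a link, leave it alone
            by_digest.setdefault(file_digest(path), []).append(path)

    preferred = os.path.join(root, preferred_dir) + os.sep
    replaced = 0
    for paths in by_digest.values():
        if len(paths) < 2:
            continue
        # Paths inside the preferred directory sort first.
        paths.sort(key=lambda p: (not p.startswith(preferred), p))
        keeper = paths[0]
        for dup in paths[1:]:
            target = os.path.relpath(keeper, start=os.path.dirname(dup))
            os.remove(dup)
            os.symlink(target, dup)
            replaced += 1
    return replaced
```

Every old path still resolves to the same bytes, so existing links keep working as long as the web server is allowed to follow symlinks.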

While you're at it, I'd check through to see if there are any files in the mess to which nobody at all is linking because the site itself never linked to them. (I predict that these will be the per-episode folders that I think sound sanest. :) )
Nom nom nom nom nom LEAVES.

Sapphi

Re: Angry computer nerd rant (for your potential amusement)

#6 Post by Sapphi » Wed Jan 04, 2012 8:46 pm

LateWhiteRabbit wrote: I don't know whether to shudder or smile . . . .
"Run the Disk Cleanup utility, HAL."
"I'm sorry, Dave... I'm afraid I can't do that. 8)"

(As to your dilemma of which response is most appropriate, I think both, just not in unison, because then you might start to look kind of creepy. :P)
"It is [the writer's] privilege to help man endure by lifting his heart,
by reminding him of the courage and honor and hope and pride
and compassion and pity and sacrifice which have been the glory of his past."
— William Faulkner
▬▬▬▬▬▬▬▬▬▬..+X+..▬▬▬▬▬▬▬▬▬▬

SvenTheViking

Re: Angry computer nerd rant (for your potential amusement)

#7 Post by SvenTheViking » Thu Jan 05, 2012 3:38 am

UPDATE TIME!

We (my friend, who's hosting the site, and I, who am doing the work, as he doesn't know PHP and is considerably busier than he should be anyway) have decided that the best thing to do is to leave the images where they are and as they are, fix the abomination of a website to the point where it's usable, build a new one from a decent CMS (I'm using PHP Fusion, as I like how it looks and works), and then nuke the old website from high orbit. We decided to do this because, as I said, a great many sites are linking to the images in their current structure. I know this because the domain was pointing to another server for all of maybe five minutes before I had the files transferred from one system to the other, and the error log was in the 5MB+ range by the time I got everything deployed properly. Each one of those errors was the server complaining that it couldn't find the images or pages which were being requested of it. That's about 1000 requests per minute, or roughly 16 per second, on average.

To put it simply, screwing with that would be just as bad as having let the site die in the first place. However, just fixing the current site would be admitting that the guy who wrote it created something worth saving, which he didn't. So, on top of the music I'm making for one game, the code I'm doing for another, everything I'm doing for my own game, the three stories I'm writing, and the classes I start taking this coming Monday, I'm now installing and modifying a CMS.

I love my life.

On another note, I'd totally call the machine The Computer of Babel if it were up to me.
"Be not afraid of greatness: some men are born great, some achieve greatness, and some have greatness thrust upon them."
William Shakespeare, "Twelfth Night"

SvenTheViking

Re: Angry computer nerd rant (for your potential amusement)

#8 Post by SvenTheViking » Tue Jan 10, 2012 4:00 am

Quick update... Paraphrased for brevity and to spare you from having to read excessive amounts of profanity.

Him: "The site isn't working. The URLs are broken. Why didn't you use the .htaccess files? You're idiots for not doing what I say."
Me: "Stop bothering us. You are a moron. I do not like you. (More angry insults, summing up my initial post in this thread). 'But [the website is] such an ungodly abomination that it makes the angels who normally sing me sweetly to sleep at night weep with anger and terror.' .htaccess files are not the place to put URLs which are then referenced in PHP."

For those of you who don't understand referencing .htaccess files for proper URLs, or even what that means, it would be like making a map (the webpage) and having that map's labels (the URLs) be largely nonsensical strings of words and characters, and expecting the person reading the map to read a book you wrote on how to properly make maps. The unwavering arrogance of that man makes me want to do very bad things, like steal candy from babies or some other villainous cliche.
"Be not afraid of greatness: some men are born great, some achieve greatness, and some have greatness thrust upon them."
William Shakespeare, "Twelfth Night"
