Petek

Apolyton SMACX Archives Will Disappear

36 posts in this topic

Robert Plomp announced on 'poly that their old server will be taken offline this Dec. 17. The significance of this event for the SMACX community is that the 'poly SMACX archives, as well as other threads such as the demo game threads, will disappear. The archive files cannot be accessed from the main 'poly menu. Instead, use this link to view the archives.

 

IMO, not everything in the SMACX archives is worth saving. However, it contains lots of strategy discussions that are still pertinent today. With that in mind, I downloaded the 125 threads listed in this post to my hard drive. Robert told me that he would make sure that these threads would be made available to others.

 

I'm posting this as a "heads up" for anyone who wants to save other archive threads. I used Firefox with a special add-on to save the threads. I'd be glad to offer advice on how to do so to anyone who requests it.

 

Petek


I will attempt to make some backups (particularly of the fiction section). If I can, I'll set up a forum scraper to take it all, but I'm not sure whether I'll be able to find the software I need, and I doubt I'll have the time to write some myself.

 

Update: So, I'm backing up the whole website, or attempting it anyway. I'll report back on my success or not later.

Edited by DrazharLn


200MB so far. If nothing else, it might prove useful as test data for some parsing programs I want to write.

Robert Plomp announced on 'poly that their old server will be taken offline this Dec. 17. The significance of this event for the SMACX community is that the 'poly SMACX archives, as well as other threads such as the demo game threads, will disappear.

 

whoa.png


Petek, I'd be interested in your method of archiving the threads. I saved a part of Acdg3 but it was very time consuming and tiresome.


My method probably doesn't save that much time, but here it is:

 

1. I installed the Firefox addon Save Complete. This enhances the Firefox command File --> Save Page As ... by adding better formatting and more graphics to the saved pages. The new command is File --> Save Complete Page as ... .

 

2. I saved all 125 links in Minute Mirage's list to a text file and then replaced "apolyton.net" with "mail.civilization4.net".

 

3. I then added all these links to a web page, so I now had working links to all the pages in the list.

 

4. I then used the Save Complete Page As command repeatedly to save the pages to my hard drive. I had to give each link a unique descriptive name.

 

Once I got into the swing, I used the clipboard and various ad hoc techniques to save time. Total time was probably 5-6 hours spread over two days, with most of it devoted to the actual saving.
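
Steps 2 and 3 above could be scripted rather than done by hand. The following is only a minimal Python sketch of that idea, not the method actually used: the function names are made up for illustration, and it assumes the links sit one per line in a text file. Note that a plain string replacement would also rewrite a "www.apolyton.net" link to "www.mail.civilization4.net", so the input may need checking first.

```python
from html import escape

OLD_HOST = "apolyton.net"
NEW_HOST = "mail.civilization4.net"


def rewrite_links(lines):
    """Swap the old host for the new one in every non-empty line."""
    return [line.strip().replace(OLD_HOST, NEW_HOST) for line in lines if line.strip()]


def build_index_page(urls, title="SMACX archive threads"):
    """Return a bare-bones HTML page with one clickable link per URL."""
    items = "\n".join(
        f'<li><a href="{escape(url, quote=True)}">{escape(url)}</a></li>'
        for url in urls
    )
    return (
        "<!DOCTYPE html>\n"
        f'<html><head><meta charset="utf-8"><title>{escape(title)}</title></head>\n'
        f"<body><ol>\n{items}\n</ol></body></html>\n"
    )


# Typical use (the file names are hypothetical):
#   with open("links.txt", encoding="utf-8") as fh:
#       urls = rewrite_links(fh)
#   with open("links.html", "w", encoding="utf-8") as fh:
#       fh.write(build_index_page(urls))
```

Opening the generated page in the browser then gives working links to every thread in the list, ready for saving one by one.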

 

Perhaps DrazharLn's project includes the files in which you're interested.

 

Petek


I've downloaded some 4GB from the website since I posted. I have no idea when the job will finish. Hopefully before it reaches 50GB.


By the by, I backed up the whole website, sans user profiles and other stuff that you need to login to get at.

 

It comes to about 5.5GB IIRC.

By the by, I backed up the whole website,..

:b:

 

now, any idea on how to make it available to browse?


I can browse it here locally, and I suppose I could set up a webserver on this machine, but I'm not keen to leave this on all the time. I can't remember if I'm allowed to do that on the Uni network either.

 

In the slightly longer term, I am looking to acquire some web-hosting equipment of my own, and I wouldn't mind hosting it then. Until then, if anyone has a particular thread or w/e that they'd like to look at, I can upload the files for that one thread.


DrazharLn,

 

Great work! Just to make sure I understand, did you back up the threads from the mail.civilization4.net domain? I suggest, for the sake of redundancy, that several people have a copy of the files. Can they be compressed to fit on a single DVD?


What is this compression you speak of?

 

hehe3e.jpg

 

 

5.5 GB will already fit on a single dual-layer DVD.


I have every publicly accessible file (that is, those that didn't require login to acquire) that lived in mail.civilization4.net/forums/ or a subdirectory. This includes all avatars, threads and attached items.

 

It does not include user pages and maybe some stuff I forgot to test for. Also, things like search and posting were handled by server-side stuff that I couldn't access, so they don't work any more.

 

However, it should be easy enough to search the threads in their current form with standard desktop search software. Indeed, you should be able to search much more thoroughly than the server did. It's compressed somewhat already at 3.09 GB. The best way to distribute the files would probably be with torrents, but I cannot use torrent software on the Uni network. If you want, I have a few DVDs that I could burn and send off to whomever.
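
For anyone without desktop search software, a few lines of Python can do a crude version of the same job. This is only a sketch under assumptions: the function names are invented for illustration, the saved pages are assumed to end in .htm or .html, and undecodable bytes are simply replaced rather than handled properly.

```python
import re
from html.parser import HTMLParser
from pathlib import Path


class _TextExtractor(HTMLParser):
    """Collects the text found between tags (script and style text included)."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def page_text(html):
    """Return the page's text with tags dropped and whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(html)
    return " ".join(" ".join(parser.chunks).split())


def search_archive(root, phrase):
    """Yield (path, snippet) for every saved page whose text contains phrase."""
    pattern = re.compile(re.escape(phrase), re.IGNORECASE)
    for path in sorted(Path(root).rglob("*.htm*")):
        text = page_text(path.read_text(encoding="utf-8", errors="replace"))
        match = pattern.search(text)
        if match:
            start = max(match.start() - 40, 0)
            yield path, text[start:match.end() + 40]


# Typical use (the directory name is hypothetical):
#   for path, snippet in search_archive("forums", "planet buster"):
#       print(path, "...", snippet)
```

It reads every file on each search, so it will be slow on a multi-gigabyte archive; an indexing desktop search tool remains the better choice for repeated use.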


For the record, unless 'Poly is closing down, I don't get why they're closing the archives down, it's not that they're particularly large or would have got a lot of traffic. Perhaps they were worried about possible security holes from the old forum software or something.

It's compressed somewhat already at 3.09 GB. The best way to distribute the files would probably be with torrents, but I cannot use torrent software on the Uni network. If you want, I have a few DVDs that I could burn and send off to whomever.

 

Can't use torrents even for legitimate traffic? The world has become a sad place.

 

I have created an ftp account on my server you can use if you guys just want a short term place to use for transferring the file around - Draz can upload it and others can just use a web browser to download it. Just PM me for the account and password Draz.

For the record, unless 'Poly is closing down, I don't get why they're closing the archives down, it's not that they're particularly large or would have got a lot of traffic. Perhaps they were worried about possible security holes from the old forum software or something.

 

The short answer is that 'Poly migrated most of their site to a new server, but not the archives. I don't know the reason for that decision. They are still paying to maintain the old server, but will stop doing so to save money. I've had some more communications with Robert and he hopes to save everything for possible future use.


Chuft: It's so annoying :<. The Uni Internet Service just blocks torrent traffic indiscriminately. I could *probably* find a way past, but there's a fairly hefty fine and they cut you off from the internet for 2 months if you do and they catch you.

 

I'll PM you in a moment, I love compression steph, btw :)

 

Petek: I suppose they must have their reasons, even if I can't see them.


I'm currently compressing the archive in preparation for uploading it to chuft's ftp server. I've written a readme file for anyone using the archive and a very simple html file to make the archive slightly easier to use.

 

Note: to extract the archive, you will need something capable of opening 7z files; I recommend 7-zip. I used 7z as opposed to zip format because it gives a better compression ratio. This means that I need to upload less data to chuft's server and that you in turn need to download less, which helps reduce (albeit slightly) the bandwidth costs. Apologies for the inconvenience if you haven't needed such software before.

 

Here is the text of the readme, if you're interested:

======================================
Apolyton Archive Backup Project Readme
======================================

Why hello there!

This is a simple archive of the old apolyton website. Unfortunately, I was 
unable to capture any part of the website that required a username and password
to access, so there are no user profiles. Included, however, are all the 
threads that were previously accessible at mail.civilization4.net (and before 
that at apolyton.net).

Additionally, some functionality of the website relied upon server side 
functionality that simply is not represented in this simple html backup (read 
on to the details section if you want to know why this is). This includes the 
search facility.

If you want to search the archives, you would be better off using some desktop 
search software on the files.

As a last note, I didn't go to any great lengths to confirm the safety of these
files and I take no responsibility for any loss or damage that may result from 
your use of the files.

=======
Erratum
=======

The previous version of this readme (the one packaged in the archive you need 
to download) states that you should download all the archives, however, the 
files compressed better than I had anticipated and all fit in one 600MB 
archive.

So there is only one archive file to download and extract.

===
Use
===

To browse the archive, download apolyton-archive-backup.7z and extract to some 
directory on a drive with at least 3.2GB of free space (you will need 7zip for 
this). Then open the file "open-archive.html" in your browser of choice to get 
at the forum archive index page.

Happy browsing.

=======
Details
=======

I made this backup using a simple tool called HTTrack Website Copier. While 
competent, this software only records the html of each page of the website. A 
real forum actually uses lots of server side software to generate those HTML 
pages for you as it goes; this server side software is not accessible to the 
common user and was not saved in this backup.

All files gathered from this process are in the site-rip directory, should you 
want to search them or access them directly.

If there is sufficient interest and/or I take the fancy I may make something a 
little more functional than this HTML rip. Until then, the archive will be 
quite unwieldy.

=========
Copyright
=========

... probably belongs to the respective posters and/or the administration of 
Apolyton.net or others in the case of the other websites ripped, can't say I 
could be bothered to look it up.

As far as I know, this archive can be used for whatever you want.


======
Thanks
======

Thanks to chuft for the temporary hosting.

Regards,
DrazharLn

 

Edit: Oh yeah, I'm splitting the archive into 650MB chunks to make them easier to download over a period of time and to make CD backups easy. You should download all the chunks before extracting.

Edit2: The archive compressed much better than I had anticipated and is just 597MB in size, so I haven't split it at all. It'll probably take ages to extract if you have a slow PC (took about 15min to compress on mine), sorry :p.

 

You can get the readme and archive from chuft's ftp server now. The readme in the archive is slightly older than the others; check out the erratum for the changes.

 

That's all for now, folks.

Edited by DrazharLn
archive completed upload.


@Draz, chuft: thanks guys.

 

 

 

***EDIT***

 

 

i'm downloading the file right now.

Edited by bdanv


bad news :(

 

the archive is not complete. you can see that if you browse it to, let's say, page 32 of the 'AC-General/Help/Strategy-Archive'. the browser won't be reading it from the save on your hard disk, but from the internet (as the link is still active).

 

i tried to back up using the same software but i finally gave up when i realized that the size of the proper save would be too big. however i'm sure it can be done if someone has the time to play with the mirroring depth and all the other preferences.
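
The symptom described above (pages whose links still lead to the live site instead of a saved copy) can be checked for automatically. Here is a rough Python sketch, assuming the saved pages are .htm or .html files and that any link still naming one of the two hosts mentioned in this thread marks a page the mirror did not capture. The function names are invented for illustration.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urlparse

# Hosts named in this thread; extend the set if the site used others.
REMOTE_HOSTS = {"mail.civilization4.net", "apolyton.net"}


class _LinkCollector(HTMLParser):
    """Gathers the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.hrefs.extend(v for k, v in attrs if k == "href" and v)


def _host(href):
    try:
        return urlparse(href).hostname
    except ValueError:  # malformed URL: treat it as having no host
        return None


def remote_links(html, hosts=REMOTE_HOSTS):
    """Return the links in html that still point at the live site."""
    parser = _LinkCollector()
    parser.feed(html)
    return [href for href in parser.hrefs if _host(href) in hosts]


def find_unmirrored(root, hosts=REMOTE_HOSTS):
    """Map each saved page under root to its leftover remote links."""
    report = {}
    for path in sorted(Path(root).rglob("*.htm*")):
        html = path.read_text(encoding="utf-8", errors="replace")
        links = remote_links(html, hosts)
        if links:
            report[path] = links
    return report
```

An empty result from find_unmirrored would suggest that every internal link was rewritten to a local file; any entries show where the mirror stopped.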

Edited by bdanv


How irritating. It seems the link recursion isn't deep enough.

 

Edit: I set link recursion to infinite, so that shouldn't have been a problem. Investigating.

Edited by DrazharLn


I've set up a new trawl but I don't know how long it will take or if it will be more successful.

 

Sorry guys, I should have tested the backup (much) more thoroughly.

 

To get this done properly I should have written something custom to do the job rather than relying on this tool.
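
As a rough idea of what "something custom" might look like, here is a minimal breadth-first crawler sketch in Python. It is an illustration under assumptions, not what was used for the backup: the names are invented, it keeps every page in memory (a real run over several gigabytes would write each page to disk instead), and it has no delays, retries or error handling.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin
from urllib.request import urlopen


class _Hrefs(HTMLParser):
    """Gathers the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.hrefs.extend(v for k, v in attrs if k == "href" and v)


def fetch_page(url):
    """Download one page over HTTP and decode it leniently."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, prefix, fetch=fetch_page):
    """Follow links breadth-first, with no depth limit, staying under prefix.

    Returns a dict mapping each visited URL to its HTML.
    """
    pages = {}
    seen = {start_url}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        html = fetch(url)
        pages[url] = html
        parser = _Hrefs()
        parser.feed(html)
        for href in parser.hrefs:
            # Resolve relative links and drop "#fragment" parts.
            target = urldefrag(urljoin(url, href)).url
            if target.startswith(prefix) and target not in seen:
                seen.add(target)
                queue.append(target)
    return pages
```

Because the seen set records every URL before it is queued, the crawl ends once all reachable pages under the prefix have been fetched, however deep the links go. Passing a different fetch function makes it easy to add throttling or to test the logic without touching the network.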

..Edit: I set link recursion to infinite, so that shouldn't have been a problem. Investigating.

that should do, but the size of the save will be huge :eek:

