Current ubuntuusers.de and doc.ubuntu-fr.org downtime.

Due to problems, two servers are currently being moved to another rack. I can assure you that we are as eager as you are to get them back online.

Sorry for the inconvenience!

1 comment September 17th, 2008

Issues over the last few weeks - explanation

A lot of discussion has been going on about why the servers have been so slow recently. It is actually fairly complex so let me explain.

We used to have one server for Apache and one server for MySQL (replicated for backups). Last October we started moving to a load-balanced architecture for greater scalability: two Apache servers, one NAS (a NetApp provided by Noris.net, our main provider, which also hosts all our servers), and one replicated MySQL. The French LoCo team was the first to move to the new platform; the others stayed on the old Apache server.

At the beginning of summer we got kicked off the NAS because our usage was too heavy. We managed to get a new server, Nun, at short notice, and sent it from France to Nuremberg. Nun is a dual Xeon 3GHz with 4GB RAM and 5×70GB 10k SCSI disks - a reasonable server for NFS. Sadly, its performance turned out to be very bad, which resulted in very slow response times. It became catastrophic once we tried to move our wiki to the NFS; DokuWiki was actually the problem here. So we left DokuWiki running on one server only (without using the NAS) while we investigated, and we started to find and report some issues. Further improvements to DokuWiki’s code have been made since then.

Even without the wiki, the performance of our NFS was still a lot worse than expected - and it would not have supported other LoCo teams moving to it. After a long investigation by smurf, it turned out that a bug in Hardy’s kernel was flooding the RAID card (bug reported here). We then moved the ubuntu-de LoCo website to just one of the new Apache servers (the old ubuntu-de portal wasn’t made to be load balanced) and used the old Apache server, Asa, as the NFS server instead. Asa is a dual Xeon 3GHz with 4GB RAM and 3×70GB 10k SCSI disks - quite comparable to Nun - though it was running Dapper and not Hardy. We found its performance to be nearly ten times better. So Nun was no longer used, and we were running NFS on the previous Apache server, which was still serving some smaller LoCos.

At this time we also had some big issues with our load balancer - we are using HAProxy. For some reason, some files were systematically returned with a 504 error, although they were served fine when accessing the backend server directly (without going through the reverse proxy). It took us a very long time to spot this issue. It turned out to be a bug in HAProxy: when one specific option was turned on, it parsed the headers of the files incorrectly and failed to pass them back. The issue has been reported and will hopefully be fixed later; for the time being we have disabled that option. The bug was also present in wget.
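
For anyone chasing a similar problem, the comparison described above (same file, once through the reverse proxy and once straight from the backend) can be sketched roughly like this. It is only an illustration, not the script we used: the function names are made up for this example, and any host names you pass in are placeholders for your own setup.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for a GET request, error codes included.

    Connection failures (urllib.error.URLError) are not caught here and
    will propagate to the caller.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx answers (such as a 504 from the proxy) end up here.
        return err.code


def find_mismatches(paths, proxy_base, backend_base, fetch=fetch_status):
    """List the paths whose status differs between proxy and backend.

    proxy_base and backend_base are hypothetical base URLs; `fetch` can be
    swapped out, for instance to test without network access.
    """
    mismatches = []
    for path in paths:
        via_proxy = fetch(proxy_base + path)
        direct = fetch(backend_base + path)
        if via_proxy != direct:
            mismatches.append((path, via_proxy, direct))
    return mismatches
```

Calling `find_mismatches(["/some/file.css"], "http://proxy.example", "http://backend.example:8080")` would return an entry such as `("/some/file.css", 504, 200)` for every file that only fails through the proxy, which narrows the problem down to the load balancer rather than Apache.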

A couple of weeks ago Ubuntu-de completely replaced their portal system with a home-made one. This caused several issues. The move to a database-backed wiki added load to an already loaded SQL server. The portal was also using mod_wsgi, which apparently was having big issues: Apache processes kept forking and the number of workers was reaching its limit for no apparent reason, resulting in a massive slowdown for web requests.

One of the biggest bottlenecks was still the SQL server, though - so we bought 2×1GB of memory on eBay, using the data from “lshw” to choose the correct modules. They took three days to arrive and, sadly, they weren’t the right ones :/ We may still use them in some other server, though. The SQL server was also running a software RAID5, which wasn’t optimal, so we wanted to switch to RAID1. Following that failed memory upgrade and RAID change, the server started acting weirdly; things that should have worked just didn’t. Even after a full reinstall it kept acting weirdly. Smurf reported it was faulty and moved the SQL to the NFS server - which was already doing NFS, plus Apache for a couple of smaller websites - and also had slower disks.

So the SQL has been terribly slow for the last week or so. This morning, smurf bought another 2GB of RAM on eBay, the correct kind this time. He installed it late this afternoon and reinstalled the server with Hardy. It is apparently working fine, and we hope to have the SQL completely back tomorrow.

At the same time we have made huge improvements to the SQL configuration and to some of the queries. We noticed a missing key in one of punbb’s tables, and completely removed any reference to punbb’s search (which was a *huge* bottleneck and was locking some tables for a long time, resulting in a “frozen” forum).
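
To give an idea of why a single missing key matters so much, here is a small self-contained sketch. It uses Python's built-in SQLite rather than our MySQL server, and the `posts` table, its columns and the index name are invented for the example; they are not punbb's real schema.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's query plan for `sql` as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable description is the last column of each row.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
# Hypothetical forum table, loosely inspired by a posts table.
conn.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, topic_id INTEGER, message TEXT)"
)
conn.executemany(
    "INSERT INTO posts (topic_id, message) VALUES (?, ?)",
    [(i % 100, f"post {i}") for i in range(10_000)],
)

lookup = "SELECT id FROM posts WHERE topic_id = ?"

# Without a key on topic_id the database has to scan the whole table.
plan_before = query_plan(conn, lookup, (42,))

# Adding the key lets the same query use an index lookup instead.
conn.execute("CREATE INDEX idx_posts_topic ON posts (topic_id)")
plan_after = query_plan(conn, lookup, (42,))

print("without index:", plan_before)
print("with index:   ", plan_after)
```

The first plan reports a full table scan and the second one mentions `idx_posts_topic`. On a busy forum that difference applies to every page view, which is why one missing key can hurt so much; on MySQL the equivalent check would be done with `EXPLAIN`.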

At the same time the German LoCo team has been looking into why their portal was making Apache run so many workers. They eventually found an issue which may have been causing it and corrected it, so this problem should be solved as well.

For the curious, this is the platform we will then be using.

3 comments August 29th, 2008

Redirection from ubuntuusers.de?

Hello,
Due to a faulty configuration, the website “ubuntuusers.de” redirects to this page. Please clear your browser cache and try again.

Hallo,
Aufgrund einer Fehlkonfiguration leitet “ubuntuusers.de” auf diese Seite. Bitte lösche den Cache deines Browsers und versuche es noch einmal.

August 27th, 2008

Ubuntu-ru up again!

Hi,

It’s been a couple of weeks of hard work for the people at Ubuntu-ru, but they finally made it: Ubuntu-ru is up again!

Check out their forum!

Add comment October 28th, 2007

Ubuntu-ru shut down due to security problems.

Ubuntu-ru.org has just been shut down following suspicious activity with their account. The case is being investigated.

12 comments October 8th, 2007

[update] Server move …

Our servers will move from Paris (France) to Nuremberg (Germany) next Monday and Tuesday.

The downtime will start on Monday evening or Tuesday morning, depending on the data centre in Paris.

If you have any questions concerning this move please do not hesitate to ask us.

[Update 2007-06-15-08:30 UTC]

We should be back up on 2007-06-19, 16:00h UTC. Hopefully.

That means for our hosted communities:

Portugal: 17:00 local time (Lisbon, UTC +01:00)
Norway: 18:00 local time (Oslo, UTC +02:00)
France: 18:00 local time (Paris, UTC +02:00)
Germany: 18:00 local time (Berlin, UTC +02:00)
Iran: 19:30 local time (Teheran, UTC +03:30)
Russia: 20:00 local time (Moscow, UTC +04:00)
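
If you want to double-check the list above, the conversion from 16:00 UTC is easy to reproduce. The sketch below simply applies the fixed UTC offsets quoted in this post (it does not look anything up in a time zone database), and the `local_times` helper is just a name chosen for this example.

```python
from datetime import datetime, timedelta, timezone

# Planned return time from this post, in UTC.
back_online_utc = datetime(2007, 6, 19, 16, 0, tzinfo=timezone.utc)

# Summer-time UTC offsets exactly as listed above: (hours, minutes).
offsets = {
    "Portugal (Lisbon)": (1, 0),
    "Norway (Oslo)": (2, 0),
    "France (Paris)": (2, 0),
    "Germany (Berlin)": (2, 0),
    "Iran (Teheran)": (3, 30),
    "Russia (Moscow)": (4, 0),
}


def local_times(moment_utc, offsets):
    """Map each place to the local wall-clock time (HH:MM) of moment_utc."""
    result = {}
    for place, (hours, minutes) in offsets.items():
        tz = timezone(timedelta(hours=hours, minutes=minutes))
        result[place] = moment_utc.astimezone(tz).strftime("%H:%M")
    return result


for place, clock in local_times(back_online_utc, offsets).items():
    print(f"{place}: {clock}")
```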

[Update 2007-06-18-14:47 UTC]

We will get the servers from the data centre on Tuesday morning, so everything will be in place this evening.

[Update 2007-06-19-04:30 UTC]

Servers are down. Wish Matthias Godspeed for the journey back to Nuremberg.

[Update 2007-06-19-16:00 UTC]

Except for Norway, we are back online.

1 comment June 15th, 2007

[update] eshu.ubuntu-eu.org blacklisted by 1&1

24.05.2007:

Unfortunately, the provider 1&1 has put eshu.ubuntu-eu.org on its blacklist.

As a result, people who have their mail address with this provider will not receive any mail from us until this problem is solved.

We are trying hard to solve this problem as soon as possible.

Update 30.05.2007

All issues are solved; we are no longer on any blacklist.

Add comment May 30th, 2007

New SQL server coming soon

Hello everybody, both our SQL servers (mawu and lisa) are slowly but surely running out of memory, resulting in heavy disk usage… and bad performance.

We just ordered that server (upgraded to 4GB RAM) to replace both of them. A big thanks to Zinobrettarsis for making us a special and really fair price.

It ought to be enough for a long time ;)

2 comments April 4th, 2007

Welcome to Ubuntu-ru

We are happy to announce that ubuntu-ru.org joined us on our servers :)

Add comment March 19th, 2007

Server reboot in the early morning hours of March 5th

Due to important kernel updates we have to reboot the servers at 4:00 GMT.

Expected downtime: 5 minutes.

2 comments March 2nd, 2007
