RESOLVED: Outage affecting some Merula services inc. ADSL

15:15 The problem has been traced to one of our core routers, which ‘hung’ without notifying the automatic monitoring system of problems, as it should have. This in turn affected routing for some customers connected to our Telehouse data centre. Some other important switches and routers were also impacted, as they were unable to see that this router was down and also failed silently.

We have now restored service to the router and are monitoring for any further issues.

This should not have happened, but despite our planning it did, and we can only apologise. We will be reconfiguring the core layout very shortly to make sure that this can’t cause such cascading problems for our customers again.
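For readers wondering how a device can fail without the monitoring noticing: a check that waits for the device to report a fault will stay quiet if the device hangs. One common guard is to treat missing heartbeats as the fault. The following is only a minimal sketch of that idea, not a description of our monitoring system; the class name, device names and 30-second timeout are all hypothetical.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class HeartbeatMonitor:
    """Flags devices that have gone quiet instead of waiting for them to report a fault."""

    timeout_seconds: float  # hypothetical threshold; choose to suit the polling interval
    last_seen: Dict[str, float] = field(default_factory=dict)

    def record_heartbeat(self, device: str, now: Optional[float] = None) -> None:
        # Store the time of the most recent heartbeat from this device.
        self.last_seen[device] = time.monotonic() if now is None else now

    def silent_devices(self, now: Optional[float] = None) -> List[str]:
        # A device counts as silent once its last heartbeat is older than the timeout.
        current = time.monotonic() if now is None else now
        return sorted(
            device
            for device, seen in self.last_seen.items()
            if current - seen > self.timeout_seconds
        )


# Hypothetical device names, with explicit timestamps so the example is repeatable.
monitor = HeartbeatMonitor(timeout_seconds=30)
monitor.record_heartbeat("core-router-a", now=0.0)
monitor.record_heartbeat("edge-switch-b", now=0.0)
monitor.record_heartbeat("edge-switch-b", now=40.0)
print(monitor.silent_devices(now=45.0))
```

In this sketch the hung router is picked up because it stopped sending heartbeats, even though it never raised an alarm itself.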

13:30 We believe that we have resolved these service issues. That said, we are still monitoring and looking for the root cause. Again, our apologies for this loss of service; updates will continue to be posted here once we’ve had a chance to check logs etc.

UPDATE: We’re working with our link team, as this is mainly affecting services out of our London data centres. Apologies for this extended downtime; we’re all working on this problem and will update here as we know more.

We’re aware of, and are investigating the cause of, outages affecting a number of services, including some leased lines, ethernet circuits and broadband lines. As soon as we know the root cause and the likely time to fix, we’ll update here.

Broadband session drops

We’ve seen some sessions drop (on ADSL and FTTC/FTTP connections) following an unexpected port reboot on one of our main switches here.

Most sessions have automatically re-connected, but if your line is failing to come back as expected, it may have a stale session which needs to be cleared down. To do this, please power down your router for 15 to 20 minutes and then attempt to re-connect.

This should bring you back on-line but please raise a fault with support here in the normal manner if you’re then still having problems.

Possible outage affecting a small number of C20 lines

We are aware of a problem with authentication to our RADIUS servers in the last 20 minutes that is affecting ONLY circuits delivered from C20 exchanges. We are working on this now and expect a resolution shortly.

It is only affecting lines that dropped earlier or had been switched off and are now trying to connect (“authenticate”). Lines that are currently logged in should carry on working normally, so this is only a problem for a small part of Merula’s ADSL estate.

[Update] URGENT: Leased line outages

[Update] 15:45 The engineers have repaired the faults identified and all circuits are up and reporting as healthy. If anyone is still seeing issues, please email support@merula.net. We are conducting a postmortem with the supplier to ascertain the root cause and the fix made, and to find out why their agreed backup routing didn’t kick in as part of the DR process. We will update the ticket as we know more.

15:10 Multiple back-hauls to us and other customers are affected by this fault; this has been escalated internally by the supplier to their 3rd-level team. No time to fix as yet.

14:26 Engineers have arrived on-site and are starting their investigations.

Engineers are en-route to both ends of this connection.

We are aware of a problem affecting one of our core bearer lines into HEX in London, which in turn is affecting a number of customer leased lines. These are currently hard down.

This is a high-priority outage for us and we are working to get it resolved as quickly as possible. We apologise for the downtime being seen. We do not currently have a time to fix but expect a status report within the next 30-45 minutes.

We will post updates here.

Overnight supplier interconnect works; morning of 4th August

Description of Works:

Essential maintenance work is required to protect the performance of Interconnect services. This work will take place during the following window.

Date & Change Window:

4 August 2015 00:00hrs to 4 August 2015 05:00hrs BST

Impact & Outage Duration:

Intermittent during the window 4 August 2015 00:00hrs to 05:00hrs BST

Services Affected: possible transient loss of connectivity on DSL circuits
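If you log connection drops in UTC and want to check whether one falls inside this window, a small script along the following lines may help. It is only a sketch: the function name is hypothetical, BST is modelled as a fixed UTC+1 offset (which holds on this date), and the end of the window is treated as exclusive.

```python
from datetime import datetime, timedelta, timezone

# BST is UTC+1; a fixed offset is enough for a single date in August.
BST = timezone(timedelta(hours=1), "BST")

# Window as published above: 4 August 2015, 00:00 to 05:00 BST.
WINDOW_START = datetime(2015, 8, 4, 0, 0, tzinfo=BST)
WINDOW_END = datetime(2015, 8, 4, 5, 0, tzinfo=BST)


def in_maintenance_window(moment: datetime) -> bool:
    """Return True if a timezone-aware timestamp falls inside the published window."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    # Aware datetimes compare correctly across time zones.
    return WINDOW_START <= moment < WINDOW_END


# 23:30 UTC on 3 August is 00:30 BST on 4 August, so it is inside the window.
print(in_maintenance_window(datetime(2015, 8, 3, 23, 30, tzinfo=timezone.utc)))
```

A drop logged at 23:30 UTC on 3 August is therefore within the works window, even though the UTC date is the day before.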

RADIUS server issues affecting ADSL logins

For a brief period, the RADIUS server here in Huntingdon that authenticates ADSL logins was refusing connections. This is now resolved and any ADSL circuits that weren’t able to log in should have automatically come back on line again. If you are still seeing issues, please restart your router.