Switch Issue – Telehouse West [Resolved]

We are aware of an issue affecting one of our core switches in Telehouse West. This is also causing intermittent disruption to some other traffic.

Our engineers are working on this and we hope to have an update within the next 30 minutes.

[Resolved 01:30]
On-site engineers have rebooted one of our core switches in Telehouse West and service is now restored. Our network engineers are investigating the cause and we will work to ensure there is no further disruption.

Emergency Planned Work – Telehouse North Switch reboot 10/3/22 10pm [complete]

Unrelated to the issue yesterday, one of our Juniper switches in Telehouse North is reporting errors and is causing routing to toggle between two routes, which is resulting in some latency issues.

On the recommendation of our vendor, we will reboot the routing engine between 10pm and midnight this evening.

This will affect directly connected customers, and some broadband customers whose lines terminate in Telehouse may see their connection drop and reconnect to an alternative LNS.

We will update this post once the work completes. We are sorry for the short notice, but we would rather resolve this now than run the risk of an unplanned outage as a result.

Single Virtual Host Down [Resolved]

We are aware of a single VM host that is currently down and is being worked on now. Our engineers are migrating the virtual machines to an alternative host. We will update this post within 30 minutes.

Just an update: this should not affect most users unless we host a site for you on this particular server. We are currently migrating the data and drives to new hardware and expect to have this back up and running very shortly.

The affected host is now back up and running. The virtual machines have been moved to alternative hardware and we can see all servers running as they should. We are sorry for any issues this may have caused.

UPDATE: Outage on some broadband lines [13/10/2021]

14:46 UPDATE
The supplier advises that their issues should now be resolved; however, it may take some time for all circuits to reconnect due to the increased load of subscribers attempting to log in to their RADIUS servers. Our apologies again for this supplier outage.
13:20 UPDATE
Good afternoon.
The supplier had a power outage at one of their data centres, which has affected their network transit and backhaul and therefore some lines. We are seeing some lines return, but not all, and will update here as we get more news.

————-
Good afternoon.
We are aware of an issue within one of the suppliers we use for these lines. At this point we don't have an announcement or full details. The issue is also affecting the supplier portal we access, and their status page isn't working.
We are trying to get an update on this which we will pass on as soon as we can.
Note: this issue is not on the Merula network but within the supplier network. Our apologies to those with affected circuits.

Issue with hex.cr1 – RESOLVED

We are aware of an issue with one of our Juniper Core Routers in Harbour Exchange Square.

The vast majority of services have routed around this and are not affected. A small number of directly connected customers may be seeing an issue.

Our engineers are working to restore service to this router. We will update this post as we know more.

[Update 18:40]
A remote-hands reboot of the core router has not restored service. Our engineers are therefore en route to the data centre to investigate and restore service; we expect them to be on site by approximately 20:15 this evening. We will update as soon as they arrive.

[Update 20:15]
Our engineer is approximately 20 minutes from the data centre and has collected spare parts en route in case any hardware needs swapping out. Next update by 21:15.

[Update 21:15]
Our engineer is on site and working on the core router. We can see that the file system is corrupt, which is why the router did not boot when it was power cycled. We are working to restore this as soon as possible. In parallel, we have moved some lines onto the switch on site, which is working, and we are moving other links as well to restore their service. Next update within an hour.

[Update 22:45]
The router has now booted after the disk corruption was cleared. The config is being copied back and applied. We hope to have service restored very shortly. Next update by 23:30.

[Update 23:20]
The router is now back and passing traffic. We have not yet enabled all peers, so some traffic may take a slightly different route than normal, but no customer services are now impacted. As part of the recovery, the software upgrade planned for this router has been applied, so the planned work for that upgrade is no longer needed.

[Update 23:30]
During the checking process we identified that a further reboot is needed to ensure the router and its config are fully up to date. The reboot is in progress and we expect it to complete in 10-20 minutes.

[Update 23:50]
The reboot cleared the router alarm and all routing is now back up. Monitoring shows all links working as they should. There are some further checks to complete; however, we do not expect any further issues. We apologise to any customers affected by the issues this evening.

 

UPDATE: 14th April 20:45

Once again, please accept our apologies for the problems you’ve seen over the previous couple of days. We realise that this has caused you all serious issues and for that, we’re very sorry.

Various internal changes have been implemented over the last 48 hours and currently, we believe that the network and associated services are now stable and will remain that way. We continue to monitor the situation closely to ensure that our network remains stable and there’s no further impact to your services.

Please email us in the normal way if you have any questions or concerns. Thanks again for your support through this incident.