Dec 4, 2017 | Information, Outages, Update
UPDATE 2: initial indications from further analysis are that the DDoS was targeted at the network core and caused one of our core routers to stop routing packets correctly. However, it did not shut down the BGP sessions, with the result that network traffic was black-holed for a large number of destinations for a period of around 30 minutes. Service was restored after the routing processes were restarted in sequence. We will continue to investigate and update here, but we believe that all services are now returning to normal. Some routers may have a stale session and will require a reboot to bring them back online.
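For those interested in the failure mode, the sketch below illustrates the symptom in a very simplified form: BGP neighbours still report as established, yet probes to destinations that should always be reachable all fail. This is an illustrative monitoring idea only, not our actual tooling; the probe targets and the bgp_sessions_established() stub are assumptions for the example.

```python
#!/usr/bin/env python3
"""Rough sketch: spot a 'BGP up but traffic black-holed' condition.

Illustrative only. The probe targets and the BGP-state stub below are
assumptions for the example, not real monitoring configuration.
"""
import subprocess

# Hypothetical destinations that should always answer when the core
# is forwarding correctly (documentation prefixes used as placeholders).
PROBE_TARGETS = ["192.0.2.1", "198.51.100.1", "203.0.113.1"]


def bgp_sessions_established() -> bool:
    """Placeholder: a real deployment would poll the router
    (e.g. via SNMP or its management API) for neighbour state."""
    return True  # assume the sessions look healthy


def target_reachable(addr: str) -> bool:
    """Single ICMP probe; True if the target answered within 2 seconds."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "2", addr],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def main() -> None:
    reachable = sum(target_reachable(t) for t in PROBE_TARGETS)
    if bgp_sessions_established() and reachable == 0:
        # Sessions claim to be up but nothing is reachable:
        # the black-hole symptom described above.
        print("ALERT: BGP up but all probes failing - possible black-hole")
    else:
        print(f"{reachable}/{len(PROBE_TARGETS)} probes OK")


if __name__ == "__main__":
    main()
```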
UPDATE: we appear to have been the subject of a widespread DDoS attack (the source of which we are still investigating). This caused two of our core routers to become unresponsive, which adversely affected DNS and routing outside our network. We have mitigated the attack and all of the network is coming back online now. Please accept our apologies for this outage; we are aware that it affected a large section of our user base.
We are aware of an issue affecting large sections of our network centred in London. We are working urgently to fix this and will update here as work progresses.
Nov 23, 2017 | Information, Outages, Planned Work, Update
UPDATE: This work is now complete.
UPDATE: this should have read THE, apologies.
We are planning a UPS replacement in our data centre in THE, following recent problems it has caused: the rack appears to be at risk of a power loss whenever we work near it. We have a replacement unit and plan to swap over to that, and will investigate the cause of the issue on the current UPS at a later date.
This will mean a brief period of downtime for some hardware in the rack. Currently only ADSL/FTTC lines there will be customer-affecting; these should simply drop and then automatically reconnect at another of our POPs. This work will take place during the course of Saturday evening.
If you are still unable to connect after this work is completed and a router reboot, please contact support in the usual way.
Oct 11, 2017 | Information, Outages, Unplanned downtime, Update
UPDATE:
We have seen services starting to recover and our traffic profile is now virtually back to normal. Any subscribers yet to reconnect may need to reboot their router if the issue persists.
The fault will remain open with our supplier until the overall service has been restored. Our apologies again to those affected.
+++++++++++++++++++
One of our back-haul providers is aware of an ongoing issue affecting a small section of our lines, causing packet loss, intermittent connectivity, or sometimes both. NOTE: this isn’t affecting all lines, but the following STD codes are those seeing issues through this supplier. We expect an update by 14:30. In the meantime, we apologise if your line is one of those affected.
01171 01173 01179 01200 01214 01282 01372 01483 01485 01512 01513 01514 01515 01517 01518 01519 01527 01553 01604 01628 01905 01932 02010 02011 02030 02031 02032 02033 02034 02035 02070 02071 02072 02073 02074 02075 02076 02077 02078 02079 02080 02081 02082 02083 02084 02085 02086 02087 02088 02089 02311 02380
Sep 17, 2017 | Information, Outages, Unplanned downtime, Update
10:23am UPDATE: the supplier reports that the problem has been resolved and we believe that all circuits are now back online. Customers on affected circuits may need to reboot their router to bring their session back on stream.
The following exchanges have been affected by this issue since 6:21am this morning.
BT and supplier engineers are en route to work on site. There is no estimated time to fix yet, but we will update here as we hear more.
Exchanges affected include Barrow, Buntingford, Bottisham, Burwell, Cambridge, Crafts Hill, Cheveley, Clare, Comberton, Costessey, Cherry Hinton, Cottenham, Dereham, Downham Market, Derdingham, Ely, Fakenham, Fordham Cambs, Feltwell, Fulbourn, Great Chesterford, Girton, Haddenham, Histon, Holt, Halstead, Harston, Kentford, Kings Lynn, Lakenheath, Littleport, Madingley, Melbourne, Mattishall, Norwich North, Rorston, Science Park, Swaffham, Steeple Mordon, Soham, Sawston, Sutton, South Wootton, Swavesey, Teversham, Thaxted, Cambridge Trunk, Trumpington, Terrington St Clements, Tittleshall, Willingham, Waterbeach, Watlington, Watton, Buckden, Crowland, Doddington, Eye, Friday Bridge, Glinton, Huntingdon, Long Sutton, Moulton Chapel, Newton Wisbech, Parson Drove, Papworth St Agnes, Ramsey Hunts, Sawtry, Somersham, St Ives, St Neots, Sutton Bridge, Upwell, Warboys, Werrington, Whittlesey, Woolley, Westwood, Yaxley, Ashwell, Gamlingay and Potton
We are aware that some other exchanges may also be impacted.
Update: we have just started to see some circuits recover, but have no update from the carrier as yet.
Jun 13, 2017 | Information, Outages, Unplanned downtime, Update
RFO 09:40am:
To fix transient IPv6 and other intermittent routing issues we had seen recently, we were obliged to upgrade the software on one of our core routers. This router holds live and backup routes that allow a smooth failover should a single router fail in London. However, with the latest software (in an undocumented change from the software supplier), both the primary and backup routers set themselves as live. The result was a routing loop for some IP addresses with static routes originating from the affected router, so traffic did not fail over correctly as it previously did. A simplified sketch of the loop follows below.
Again, please accept our apologies for this short outage. It shouldn’t have happened.
We understand the cause, the problem has now been fixed on the one affected router, and we have checked all other routers in the network; we are confident all are now running properly.
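To illustrate the failure mode in the RFO above: if both the primary and backup routers believe they are live, each can end up pointing the route for an affected prefix at the other, and packets loop until their TTL expires. The sketch below is purely illustrative; the router names and routes are invented for the example.

```python
"""Simplified illustration of the routing loop described above.
The two next-hop tables and the 'customer' endpoint are invented
for the example."""

# Correct behaviour: only the live router originates the static route;
# the backup forwards towards the primary.
routes_ok = {"primary": "customer", "backup": "primary"}

# Faulty behaviour after the upgrade: both routers think they are live,
# and each installs the other as the next hop for the prefix.
routes_faulty = {"primary": "backup", "backup": "primary"}


def trace(routes: dict, start: str, ttl: int = 8) -> list:
    """Follow next hops until we reach the customer or run out of TTL."""
    path, hop = [start], start
    for _ in range(ttl):
        hop = routes.get(hop)
        path.append(hop)
        if hop == "customer" or hop is None:
            return path
    path.append("TTL exceeded - routing loop")
    return path


print(" -> ".join(trace(routes_ok, "primary")))      # primary -> customer
print(" -> ".join(trace(routes_faulty, "primary")))  # bounces back and forth, then loop detected
```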
UPDATE 09:27am:
We have identified the root cause, a core switch at one of our London locations, and are working on bringing it back into service. There is no ETA yet, but we expect this to be resolved shortly. Apologies for the downtime some of you are experiencing.
09:09am: We are aware of reports of leased lines being down and are investigating. More updates will follow here once we know the cause and have an ETA for the fix.
May 3, 2017 | Update
This was resolved at approximately 4pm, after the faulty switch on the supplier's network was swapped out.
[Update at 14:46]
The supplier advises us that most lines are now returning on-stream; this may take a few more minutes as the RADIUS servers catch up. Anyone still affected after this time should power off their router for at least 20 minutes to clear any stale session. Please email support@merula.net if this fails to bring you back online.
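For anyone curious why a lengthy power-off helps, the idea, roughly, is that a session whose last RADIUS accounting update is older than a cut-off looks stale and can block a fresh connection until it is cleared or times out. The sketch below is illustrative only; the record format and the 20-minute cut-off are assumptions for the example, not our production configuration.

```python
"""Rough sketch: flag PPP sessions that look stale based on the age of
their last accounting update. Usernames, timestamps and the cut-off
are invented for the example."""
from datetime import datetime, timedelta

STALE_AFTER = timedelta(minutes=20)  # matches the power-off advice above

# Hypothetical accounting snapshot: username -> time of last update.
last_update = {
    "user1@example.net": datetime.now() - timedelta(minutes=3),
    "user2@example.net": datetime.now() - timedelta(minutes=45),
}


def stale_sessions(snapshot: dict, now: datetime) -> list:
    """Return usernames whose session has not updated within the cut-off."""
    return [user for user, seen in snapshot.items() if now - seen > STALE_AFTER]


print(stale_sessions(last_update, datetime.now()))  # ['user2@example.net']
```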
We apologise for the lengthy downtime and are looking at further remedial work with the supplier to ensure that such a failure doesn’t affect us in future.
[Update at 14:33]
Apologies for the lack of anything concrete so far on time to fix; we are escalating this to senior managers at the supplier to get this resolved.
[Update at 13:33]
Senior engineers are currently on site working on the faulty hardware.
Further updates will be posted once the work has been completed.
[Update at 12:35]
Supplier update: We’re seeing a partial recovery on the equipment.
We’re aware some circuits are still down; our engineers are looking to replace some of the hardware in the switch stack.
Further updates will be posted when available.
[Update at 12:10]
The supplier has a new switch en route to the site to be swapped in; they expect this to be completed by 1pm. We’ll update as this progresses.
We are aware of a problem affecting one of the interconnect switches on a transit supplier's network, which caused a number of lines to drop earlier this morning; these lines are still down. The supplier and we are working on getting this switch bypassed and replaced. We currently have no time-frame for a fix, but we believe the service impact will not last much longer.