Fix Instance reachability check failed | AWS | EC2 instance

status check tab

If you are seeing "instance reachability check failed", you are in the right place. This post covers the general investigation, the analysis, and the action items you need to resolve it.

In this case, the system status checks are passing, so the underlying hardware is healthy and the AWS infrastructure can safely be ruled out as the cause. A failing instance status check points instead to a problem at the instance/OS level.

General Investigation

Check the performance and system-related metrics for the instance using CloudWatch graphs. For this kind of failure, the investigation typically turns up the following:

  1. System status checks: these did not fail, confirming that the underlying AWS host on which the instance resides is fine.
  2. Instance status checks: these failed and are still failing. Such failures can be caused by a software (OS) fault, a network configuration issue, or a problem with the application or other resources at the instance level.
  3. The logs show that the eth0 adapter link is not ready.
  4. The console logs also show "Dependency failed" errors related to the /apps file system.
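The two status checks and the console log can also be read from the command line instead of the console. The following is a minimal sketch, assuming the AWS CLI is installed and configured with credentials and a region. The helper names `show_status_checks` and `show_console_log` are made up for illustration and are not part of the original setup.

```shell
# Print the system and instance status check results for one instance.
# Assumption: the AWS CLI is installed and configured.
# show_status_checks is a hypothetical helper; $1 is the instance ID.
show_status_checks() {
  aws ec2 describe-instance-status \
    --instance-ids "$1" \
    --include-all-instances \
    --query 'InstanceStatuses[0].[SystemStatus.Status, InstanceStatus.Status]' \
    --output text
}

# Print the console log, where boot errors such as "Dependency failed" appear.
# show_console_log is a hypothetical helper; $1 is the instance ID.
show_console_log() {
  aws ec2 get-console-output --instance-id "$1" --query 'Output' --output text
}
```

Call each helper with your own instance ID. A result where the system status is ok while the instance status is impaired matches the situation described in this post.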


Analysis

This issue is not caused by an AWS infrastructure failure. See the CloudWatch metrics in [1], which show that no AWS-side failures were encountered. When you check the logs of your instance, look for the following findings:

  1. The login console is stuck with the error ‘Cannot open access to console, the root account is locked.’
  2. A timeout while waiting for the device dev-mapper-vg_app\x2dlv_app.device, together with dependency failures for /apps.

How do you tell whether the instance is in emergency mode? Look for lines like these in the console log:

[ TIME ] Timed out waiting for device dev-mapper-vg_app\x2dlv_app.device.

[DEPEND] Dependency failed for /apps.

[DEPEND] Dependency failed for Local File Systems.

[DEPEND] Dependency failed for Relabel all filesystems, if necessary.

[DEPEND] Dependency failed for Migrate local… structure to the new structure.

[DEPEND] Dependency failed for Mark the need to relabel after reboot.

[DEPEND] Dependency failed for /apps/.swap.

[DEPEND] Dependency failed for Swap.
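If you have saved the console log to a file, the markers above can be searched for rather than read by eye. This is a small sketch, not a definitive check. The helper names `find_boot_failures` and `is_emergency_mode` are hypothetical, and the patterns are simply the strings that appear in the log lines quoted in this post.

```shell
# List the boot-failure lines in a saved console log.
# find_boot_failures is a hypothetical helper; $1 is the path to the log file.
find_boot_failures() {
  grep -E 'Welcome to emergency mode|Timed out waiting for device|Dependency failed for' "$1"
}

# Succeed (exit status 0) only if the log contains the emergency mode banner.
# is_emergency_mode is a hypothetical helper; $1 is the path to the log file.
is_emergency_mode() {
  grep -q 'Welcome to emergency mode' "$1"
}
```

For example, `is_emergency_mode console.log && find_boot_failures console.log` prints the failing units only when the banner is present, where console.log is whatever file you saved the output to.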

My console log output is attached below.

Action Plan

1. Launch a temporary instance in the same AZ as the problematic instance.

2. Stop the problematic instance and detach its root volume.

Note: When an instance is stopped and started again, its public IP address changes (unless an Elastic IP is associated with it).

3. Attach the volume to the temporary instance as a secondary volume (say, /dev/xvdf).

4. Start the temporary instance and log in to it.

5. Mount the secondary volume, using /mnt as the mount point:

sudo mount /dev/xvdf1 /mnt

6. Create a backup of your config files before editing them, so that you can revert to the working file if your edits introduce errors.

sudo cp /mnt/etc/fstab /mnt/etc/fstab_backup

7. Comment out every entry in the /mnt/etc/fstab file except the one for ‘/’.

sudo vi /mnt/etc/fstab

After commenting out all the other entries, save and exit the file.
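If you would rather not edit the file by hand, the same change can be scripted. Here is a minimal sketch, assuming a conventional fstab in which the mount point is the second whitespace-separated field. The helper name `comment_non_root_fstab` is made up for illustration. It writes the result to standard output and does not modify the file itself.

```shell
# Print an fstab with every entry except the "/" mount commented out.
# comment_non_root_fstab is a hypothetical helper; $1 is the fstab path.
# Assumption: the mount point is the second whitespace-separated field.
comment_non_root_fstab() {
  awk '
    /^[ \t]*#/ || NF == 0 { print; next }   # keep comments and blank lines
    $2 == "/"             { print; next }   # keep the root filesystem entry
                          { print "#" $0 }  # comment out everything else
  ' "$1"
}

# Possible usage on the mounted volume (review the output before copying it back):
#   comment_non_root_fstab /mnt/etc/fstab > /tmp/fstab.new
#   sudo cp /tmp/fstab.new /mnt/etc/fstab
```

Writing to a separate file first lets you inspect the result before it replaces the real fstab, which matters here because a bad fstab is what broke the boot in the first place.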

8. After the above steps, unmount the secondary volume from the temporary instance:

sudo umount -l /mnt

9. Detach the secondary volume from the temporary instance and attach it back to the original instance as the root volume (/dev/xvda). Start the original instance and check whether you are able to log in. If you still face any issues, please feel free to comment.
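The stop, detach, and attach steps above can also be done with the AWS CLI rather than the console. What follows is only a sketch, assuming the AWS CLI is installed and configured. The helper names `stop_instance` and `move_volume`, and all IDs in the usage comments, are placeholders and not values from the original setup.

```shell
# Stop an instance and wait until it has fully stopped.
# stop_instance is a hypothetical helper; $1 is the instance ID.
stop_instance() {
  aws ec2 stop-instances --instance-ids "$1" || return 1
  aws ec2 wait instance-stopped --instance-ids "$1"
}

# Detach a volume from wherever it is attached, then attach it to a target.
# move_volume is a hypothetical helper.
# $1 = volume ID, $2 = target instance ID, $3 = device name on the target.
move_volume() {
  aws ec2 detach-volume --volume-id "$1" || return 1
  aws ec2 wait volume-available --volume-ids "$1" || return 1
  aws ec2 attach-volume --volume-id "$1" --instance-id "$2" --device "$3"
}

# Possible usage, with placeholder IDs:
#   stop_instance i-ORIGINAL
#   move_volume vol-ROOT i-TEMP /dev/xvdf        # steps 2 and 3
#   ... mount, fix fstab, unmount on the temporary instance ...
#   move_volume vol-ROOT i-ORIGINAL /dev/xvda    # step 9
#   aws ec2 start-instances --instance-ids i-ORIGINAL
```

The wait commands are there because detaching and stopping are asynchronous, so attaching immediately after requesting a detach can fail.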

My CloudWatch Report [1]

Console log of the instance that failed the reachability check

Welcome to emergency mode! After logging in, type “journalctl -xb” to view
system logs, “systemctl reboot” to reboot, “systemctl default” or ^D to
try again to boot into default mode.

Cannot open access to console, the root account is locked.
See sulogin(8) man page for more details.

Press Enter to continue.

*Note

If you had kept data under /dev and can no longer find it there: /dev is a temporary filesystem, so data stored on it is lost once the system is rebooted or restarted. This is what happened in my case.

These steps should solve the instance reachability check failure when, as here, it is caused by a failed mount from /etc/fstab. Hope you like it. Feel free to comment.

Reference:

[1] https://us-east-2.console.aws.amazon.com/cloudwatch/deeplink.js?region=us-east-2#metricsV2:graph=%7B%22metrics%22%3A%5B%5B%22AWS%2FEC2%22%2C%22StatusCheckFailed_Instance%22%2C%22InstanceId%22%2C%22i-08472f91f31dfbde9%22%5D%5D%2C%22stat%22%3A%22Sum%22%2C%22period%22%3A300%2C%22start%22%3A%222021-08-05T06%3A52%3A56.812Z%22%2C%22end%22%3A%222021-08-10T06%3A52%3A56.812Z%22%2C%22region%22%3A%22us-east-2%22%7D

[2] https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/monitoring-system-instance-status-check.html#instance-status-checks

[3] https://aws.amazon.com/compliance/shared-responsibility-model/

[4] https://aws.amazon.com/compliance/data-privacy-faq/

 
