Webmin crashes and needs reboot

From time to time my server crashes and needs a reboot. The server is hosted on a DigitalOcean droplet and has plenty of reserve memory. How can I diagnose this issue?

Status: 
Active

Comments

Howdy -- thanks for contacting us!

It's rare to see Webmin crash for a reason other than a resource issue, though we'll certainly help look around to determine what's going on.

The next time Webmin crashes, before restarting Webmin or rebooting, can you run this command:

dmesg | tail -30

Also, do you see any errors in /var/webmin/miniserv.error?

Lastly, do you have a /proc/user_beancounters file? If so, can you paste in its contents?

Thanks!

Subroutine arpa_to_ip redefined at /usr/share/webmin/bind8/records-lib.pl line 591.
Subroutine ip_to_arpa redefined at /usr/share/webmin/bind8/records-lib.pl line 601.
Subroutine ip6int_to_net redefined at /usr/share/webmin/bind8/records-lib.pl line 611.
Subroutine net_to_ip6int redefined at /usr/share/webmin/bind8/records-lib.pl line 638.
Subroutine valdnsname redefined at /usr/share/webmin/bind8/records-lib.pl line 657.
Subroutine valemail redefined at /usr/share/webmin/bind8/records-lib.pl line 685.
Subroutine absolute_path redefined at /usr/share/webmin/bind8/records-lib.pl line 696.
Subroutine parse_spf redefined at /usr/share/webmin/bind8/records-lib.pl line 704.
Subroutine join_spf redefined at /usr/share/webmin/bind8/records-lib.pl line 750.
Subroutine parse_dmarc redefined at /usr/share/webmin/bind8/records-lib.pl line 794.
Subroutine join_dmarc redefined at /usr/share/webmin/bind8/records-lib.pl line 818.
Subroutine join_record_values redefined at /usr/share/webmin/bind8/records-lib.pl line 848.
Subroutine compute_serial redefined at /usr/share/webmin/bind8/records-lib.pl line 869.
Subroutine convert_to_absolute redefined at /usr/share/webmin/bind8/records-lib.pl line 904.
Subroutine get_zone_file redefined at /usr/share/webmin/bind8/records-lib.pl line 923.
Subroutine get_dnskey_record redefined at /usr/share/webmin/bind8/records-lib.pl line 947.
Subroutine record_id redefined at /usr/share/webmin/bind8/records-lib.pl line 969.
Subroutine find_record_by_id redefined at /usr/share/webmin/bind8/records-lib.pl line 979.
Subroutine get_dnskey_rrset redefined at /usr/share/webmin/bind8/records-lib.pl line 998.
Subroutine is_raw_format_records redefined at /usr/share/webmin/bind8/records-lib.pl line 1020.

No errors or data in /proc
Webmin has not crashed again yet.

Thanks

Hmm, don't see anything too unusual in your error log there. Those are just some notices that are safe to ignore.

What I'd suggest is to keep an eye out for the next time it happens, and if/when it does, take a look at the "dmesg" output. It may contain some helpful info for debugging what you're seeing.

Hello, it crashed again. Here is the result of "dmesg":

[617108.408845] [ 9984] 33 9984 116766 2092 148 4 0 0 apache2
[617108.408849] [ 9985] 33 9985 116766 2092 148 4 0 0 apache2
[617108.408852] [ 9986] 1023 9986 90066 10484 119 3 0 0 php5-cgi
[617108.408856] [ 9988] 33 9988 116751 2109 147 4 0 0 apache2
[617108.408860] [ 9989] 33 9989 116751 2109 147 4 0 0 apache2
[617108.408864] [ 9990] 33 9990 116751 2109 147 4 0 0 apache2
[617108.408868] [ 9991] 33 9991 116766 2092 148 4 0 0 apache2
[617108.408872] [ 9993] 1023 9993 89457 6759 114 3 0 0 php5-cgi
[617108.408876] [ 9994] 1023 9994 89297 6304 112 3 0 0 php5-cgi
[617108.408879] [ 9998] 33 9998 116720 2053 145 4 0 0 apache2
[617108.408883] [ 9999] 33 9999 116720 2053 145 4 0 0 apache2
[617108.408887] [10000] 33 10000 116748 2082 147 4 0 0 apache2
[617108.408891] [10001] 33 10001 116748 2082 147 4 0 0 apache2
[617108.408895] [10002] 33 10002 116744 2060 146 4 0 0 apache2
[617108.408898] [10003] 33 10003 116720 2053 145 4 0 0 apache2
[617108.408902] [10004] 33 10004 116720 2053 145 4 0 0 apache2
[617108.408906] [10005] 33 10005 116748 2082 147 4 0 0 apache2
[617108.408910] [10007] 1023 10007 88480 4271 109 3 0 0 php5-cgi
[617108.408914] [10008] 1023 10008 89169 5040 109 3 0 0 php5-cgi
[617108.408918] Out of memory: Kill process 1458 (mysqld) score 58 or sacrifice child
[617108.409049] Killed process 1458 (mysqld) total-vm:1084176kB, anon-rss:120260kB, file-rss:0kB
[617108.456379] init: mysql main process (1458) killed by KILL signal
[617108.456413] init: mysql main process ended, respawning
[617108.543094] audit: type=1400 audit(1494786641.753:17): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="/usr/sbin/mysqld" pid=10020 comm="apparmor_parser"
[617108.900986] init: mysql main process (10035) terminated with status 1
[617108.900999] init: mysql main process ended, respawning
[617109.786239] init: mysql post-start process (10036) terminated with status 1
[617109.802487] audit: type=1400 audit(1494786643.013:18): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="/usr/sbin/mysqld" pid=10060 comm="apparmor_parser"
[617109.857225] init: mysql main process (10072) terminated with status 1
[617109.857239] init: mysql respawning too fast, stopped

Thanks

Okay, it does look like you're running into memory problems there.

The Linux kernel is killing off processes in order to keep the server up and running.
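If the dmesg output is long, a small filter along these lines may help pull out just the lines where the kernel's out-of-memory killer acted. This is only a sketch: the function name oom_events is made up for illustration, and the two patterns are simply the message prefixes visible in the output above, so other kernel versions may word them differently.

```shell
# Keep only the OOM-killer lines from kernel log text read on stdin.
# "oom_events" is a hypothetical helper name, not a standard command.
oom_events() {
    grep -E 'Out of memory|Killed process'
}

# Typical use on the server (dmesg may need root on some systems):
#   dmesg | oom_events
```

Each "Killed process" line names the process that was sacrificed, which in the output above was mysqld.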

You may need to review all the running processes, and ensure that none are using up a large amount of memory. Sometimes MySQL can get particularly large, for example.
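One possible way to review that from the command line is to total up resident memory per program name. This is a rough sketch: rss_by_command is a hypothetical helper name, and it assumes the procps version of ps found on most Linux systems, where "rss=" and "comm=" print those columns without a header.

```shell
# Sum resident memory (RSS, in kB) per command name, biggest first.
# Expects lines of "RSS_KB COMMAND" on stdin, as printed by: ps -eo rss=,comm=
# Only the first word of the command name is used.
rss_by_command() {
    awk '{ total[$2] += $1 }
         END { for (c in total) printf "%d MB %s\n", total[c] / 1024, c }' |
        sort -rn
}

# Typical use on the server:
#   ps -eo rss=,comm= | rss_by_command | head
```

Keep in mind that the totals overstate programs with many child processes, such as Apache, because memory shared between the children is counted once per process.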

Also, you may want to configure Apache's maximum connections to reduce how many are allowed at once. It's possible that you're receiving bursts of Apache traffic that are causing problems.

Also, what is the output of the command "free -m"?

This is a bit beyond my skillset! I have 171 running processes.

Real memory: 1.95 GB total / 1.24 GB free / 1.06 GB cached
Swap space: 0 bytes total / 0 bytes free

1464 mysql 866.28 MB /usr/sbin/mysqld

Not sure what I should do next, maybe go back to using NGINX?

free -m

             total       used       free     shared    buffers     cached
Mem:          2000       1495        505        513         38        759
-/+ buffers/cache:        697       1302
Swap:

I wouldn't switch to Nginx. I'd recommend continuing to use Apache and just making a few tweaks to your setup there.

First, do you have the ability to add in swap space?

512MB of swap can go a long way.
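In case it helps, here is one common way to add a swap file on Linux, sketched as a helper that only prints the commands so they can be reviewed before anything is run. The helper name swap_commands and the /swapfile path are assumptions for illustration; your provider may document a preferred method.

```shell
# Print (not run) the commands to create and enable a swap file.
# $1 = size in MB (default 512), $2 = path (default /swapfile, an assumption).
# "swap_commands" is a hypothetical helper name.
swap_commands() {
    swap_size_mb=${1:-512}
    swap_path=${2:-/swapfile}
    cat <<EOF
dd if=/dev/zero of=$swap_path bs=1M count=$swap_size_mb
chmod 600 $swap_path
mkswap $swap_path
swapon $swap_path
EOF
}

# Review the output first, then run it as root, for example:
#   swap_commands 512 /swapfile | sudo sh -e
```

Swap enabled this way lasts until the next reboot. To keep it, a line such as "/swapfile none swap sw 0 0" would normally be added to /etc/fstab. Afterwards, "free -m" should show a non-zero Swap total.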

Second, what is the output of this command:

grep MaxClients /etc/apache2/apache2.conf

I can add swap space and report back np.

grep MaxClients /etc/apache2/apache2.conf = NO OUTPUT

Hmm, try adding a "-i" to that, so it'd look like this:

grep -i MaxClients /etc/apache2/apache2.conf


Swap created, will monitor Thanks

grep -i MaxClients /etc/apache2/apache2.conf still = NO OUTPUT

Hmm, that may be located in a different file on your system there.

How about this command here, it'll search all the Apache configs:

find /etc/apache2 -type f | xargs grep -i maxclients

This is the nearest config I can find but no directive for max clients.

Darnit, I'm sorry, it looks like new installs of Ubuntu 14.04 change the setup a bit, and don't use MaxClients.

I have "MaxClients" on my test system here, but only because it was upgraded from an older system.

Here is what I'd do -- I'd edit the config file you shared above, as that's exactly what we need to tweak -- and in that file, I'd change MaxRequestWorkers from 150 to 75, and then restart Apache.

Having that set to 150 can be too much, and can let a server get overloaded if it doesn't have enough RAM to handle 150 requests at a time.
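As a rough sanity check on a number like that, you can divide the RAM you're willing to give Apache by the size of one Apache process. The figures here are assumptions for illustration, not measurements: a 600 MB budget for Apache, and about 8 MB per process (the apache2 entries in the dmesg output above show an rss of roughly 2,000 pages, which is about 8 MB if pages are 4 kB). The php5-cgi processes are larger and aren't counted here, so treat the result as an upper bound. The function name max_workers is made up.

```shell
# Rough ceiling for MaxRequestWorkers: RAM budget for Apache / RAM per worker.
# $1 = MB you are willing to give Apache, $2 = average MB per Apache process.
# Uses integer division, so the result is rounded down.
max_workers() {
    echo $(( $1 / $2 ))
}

max_workers 600 8   # prints 75
```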

Done that, will monitor and see what happens. Thanks for your help!


Hello, it crashed again. Attached is the result of the dmesg command. Thanks

I have a few ideas that should help, but one quick thing -- how many emails are in your mail queue right now?

You can determine that by running this command:

mailq | tail -1
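If you'd rather get just the number, for example to check it again later, a small wrapper like this should work. It's a sketch that assumes a Postfix-style summary line of the form "-- N Kbytes in M Requests." and treats anything else, such as "Mail queue is empty", as zero. The function name queue_count is made up.

```shell
# Print the number of queued messages from mailq output read on stdin.
# Assumes a Postfix-style summary line: "-- 524 Kbytes in 7 Requests."
# Any other last line (for example "Mail queue is empty") counts as 0.
queue_count() {
    tail -n 1 | awk '$1 == "--" { print $5; found = 1 }
                     END { if (!found) print 0 }'
}

# Typical use on the server:
#   mailq | queue_count
```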

This is the output:

-- 524 Kbytes in 7 Requests.

Okay, it doesn't look like there are many.

Is the ClamAV service being hosted on a different server in your setup there?

Thanks, from the output there it looks like it's set to use the standalone scanner.

That's okay, though during a burst of email it could end up using a lot of RAM, which could be part of the issue you're experiencing.

My suggestion would be to change the "Virus Scanning Program" option from "Standalone" to "Server Scanner".

That should cut down on a lot of RAM usage.

Getting an error when I try to activate it:

"The server virus scanner cannot be selected unless the clamd virus scanning server is running"

Update: I enabled the ClamAV server and rebooted, and then the error disappeared. I will try this setting and see what happens. Thanks

You may want to ensure the ClamAV service is running. You can do that in Webmin -> Servers -> Bootup and Shutdown.

@andreychek I'm facing the exact same problem. Can you elaborate on how to change the "Virus Scanning Program" option from "Standalone" to "Server Scanner"?