Discussion:
Radius Server stuck and stop responding.
Awais
2012-05-31 04:56:17 UTC
We are using a RADIUS server for IPTV services.
It was running fine, but two days ago it got stuck and stopped responding.
After a restart it is working fine again. I want to know what could have
caused it to get stuck and stop responding. It is on trial, so they want
to know the reason. It will also help me avoid that bug in the future.


--
View this message in context: http://freeradius.1045715.n5.nabble.com/Radius-Server-stuck-and-stop-responding-tp5713440.html
Sent from the FreeRadius - User mailing list archive at Nabble.com.
-
List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html
Fajar A. Nugraha
2012-05-31 05:27:51 UTC
Post by Awais
We are using a RADIUS server for IPTV services.
It was running fine, but two days ago it got stuck and stopped responding.
After a restart it is working fine again. I want to know what could have
caused it to get stuck and stop responding. It is on trial, so they want
to know the reason. It will also help me avoid that bug in the future.
Start by saying what version you use, and how you installed it on what
OS. Some combinations have known bugs.

Also, check what the radius log says (usually in /var/log/radius or
/var/log/freeradius). It has very useful messages in some cases (e.g.
when the backend db doesn't respond fast enough).
--
Fajar
Awais
2012-05-31 06:08:16 UTC
Red Hat Enterprise Linux Server release 5.5

Installed from the tar.gz file: first untar it, then make, make install, etc.

The database is on another server.

Please also tell me how I can find out how many requests were being
processed at that time.

Part of the log file is below:


Query to execute ::
CALL
Proc_Ngi_IPTV_Authentication('00606E71E0D6','E8D9BE8F','0','E8D9BE8F',@ret);
select @ret

----------Result-----------
Value of Return Code :: 0

Request Start time = 67058605
User-Name = 00606E71E0D6
User-Password = E8D9BE8F
Authorize-type = 0
Caller id is = E8D9BE8F
CALL
Proc_Ngi_IPTV_Authentication('00606E71E0D6','E8D9BE8F','0','E8D9BE8F',@ret);
select @ret
Return-code = 0
USERNAME & PASSWORD HAVE JUST MATCHED
REQUEST END TIME = 67058764
Process time = 159
modcall[authorize]: module "sql" returns ok for request 41885
rlm_pap: Found existing Auth-Type, not changing it.
modcall[authorize]: module "pap" returns noop for request 41885
modcall: leaving group authorize (returns ok) for request 41885
rad_check_password: Found Auth-Type PAP
rad_check_password: Auth-Type = Accept, accepting the user
Login OK: [00606E71E0D6/E8D9BE8F] (from client localhost port 0 cli
E8D9BE8F)
Processing the post-auth section of radiusd.conf
modcall: entering group post-auth for request 41885
radius_xlat:
'/usr/local/var/log/radius/radacct/10.2.4.17/auth-detail-20120516:18'
rlm_detail:
/usr/local/var/log/radius/radacct/%{Client-IP-Address}/auth-detail-%Y%m%d:%H
expands to
/usr/local/var/log/radius/radacct/10.2.4.17/auth-detail-20120516:18
modcall[post-auth]: module "reply_log" returns ok for request 41885
modcall: leaving group post-auth (returns ok) for request 41885
Sending Access-Accept of id 222 to 111.12.432.17 port 33
ReturnCode = 0
Finished request 41885
Going to the next request
Waking up in 1 seconds...
rad_recv: Access-Request packet from host 10.2.4.17:33033, id=222, length=92
Sending duplicate reply to client localhost:33033 - ID: 222
Re-sending Access-Accept of id 222 to 111.12.432.17 port 33
Waking up in 1 seconds...
--- Walking the entire request list ---
Waking up in 1 seconds...
--- Walking the entire request list ---
Cleaning up request 41856 ID 178 with timestamp 4fb4565e
Cleaning up request 41857 ID 179 with timestamp 4fb4565e
Cleaning up request 41858 ID 180 with timestamp 4fb4565e
Cleaning up request 41859 ID 181 with timestamp 4fb4565e
Cleaning up request 41860 ID 182 with timestamp 4fb4565e
Cleaning up request 41861 ID 183 with timestamp 4fb4565e
Cleaning up request 41862 ID 184 with timestamp 4fb4565e
Waking up in 1 seconds...
--- Walking the entire request list ---
Cleaning up request 41863 ID 185 with timestamp 4fb4565f
Cleaning up request 41864 ID 187 with timestamp 4fb4565f
Cleaning up request 41865 ID 193 with timestamp 4fb4565f
Cleaning up request 41866 ID 195 with timestamp 4fb4565f
Cleaning up request 41867 ID 196 with timestamp 4fb4565f
Cleaning up request 41868 ID 197 with timestamp 4fb4565f
Waking up in 1 seconds...
--- Walking the entire request list ---
Cleaning up request 41869 ID 198 with timestamp 4fb45660
Cleaning up request 41870 ID 199 with timestamp 4fb45660
Cleaning up request 41871 ID 200 with timestamp 4fb45660
Cleaning up request 41872 ID 202 with timestamp 4fb45660
Cleaning up request 41873 ID 203 with timestamp 4fb45660
Cleaning up request 41874 ID 204 with timestamp 4fb45660
Waking up in 1 seconds...
--- Walking the entire request list ---
Cleaning up request 41875 ID 206 with timestamp 4fb45661
Cleaning up request 41876 ID 207 with timestamp 4fb45661
Cleaning up request 41877 ID 212 with timestamp 4fb45661
Cleaning up request 41878 ID 214 with timestamp 4fb45661
Cleaning up request 41879 ID 215 with timestamp 4fb45661
Cleaning up request 41880 ID 216 with timestamp 4fb45661
Cleaning up request 41881 ID 218 with timestamp 4fb45661
Waking up in 1 seconds...
--- Walking the entire request list ---
Cleaning up request 41882 ID 219 with timestamp 4fb45662
Cleaning up request 41883 ID 220 with timestamp 4fb45662
Cleaning up request 41884 ID 221 with timestamp 4fb45662
Cleaning up request 41885 ID 222 with timestamp 4fb45662
Nothing to do. Sleeping until we see a request.
--- Walking the entire request list ---
Nothing to do. Sleeping until we see a request.
Exiting...
rlm_sql (sql): Closing sqlsocket 14
rlm_sql (sql): Closing sqlsocket 13
rlm_sql (sql): Closing sqlsocket 12
rlm_sql (sql): Closing sqlsocket 11
rlm_sql (sql): Closing sqlsocket 10
rlm_sql (sql): Closing sqlsocket 9
rlm_sql (sql): Closing sqlsocket 8
rlm_sql (sql): Closing sqlsocket 7
rlm_sql (sql): Closing sqlsocket 6
rlm_sql (sql): Closing sqlsocket 5
rlm_sql (sql): Closing sqlsocket 4
rlm_sql (sql): Closing sqlsocket 3
rlm_sql (sql): Closing sqlsocket 2
rlm_sql (sql): Closing sqlsocket 1
rlm_sql (sql): Closing sqlsocket 0
[ OK ]
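
The "Cleaning up request" lines above carry a hexadecimal timestamp field. A minimal Python sketch for turning those into readable times and counting cleanups per second might look like the following. It assumes the hex value is seconds since the Unix epoch, which the thread does not confirm; the function names are made up for illustration.

```python
import re
from collections import Counter
from datetime import datetime, timezone

# Matches lines such as:
#   Cleaning up request 41885 ID 222 with timestamp 4fb45662
CLEANUP_RE = re.compile(
    r"Cleaning up request (\d+) ID (\d+) with timestamp ([0-9a-fA-F]+)"
)


def decode_cleanups(log_text):
    """Return (request_number, packet_id, utc_datetime) per cleanup line."""
    rows = []
    for match in CLEANUP_RE.finditer(log_text):
        request, packet_id, stamp = match.groups()
        # Assumption: the hex field is seconds since the Unix epoch.
        when = datetime.fromtimestamp(int(stamp, 16), tz=timezone.utc)
        rows.append((int(request), int(packet_id), when))
    return rows


def cleanups_per_second(rows):
    """Count how many requests share each decoded timestamp."""
    return Counter(when for _, _, when in rows)


sample = """\
Cleaning up request 41884 ID 221 with timestamp 4fb45662
Cleaning up request 41885 ID 222 with timestamp 4fb45662
"""

for request, packet_id, when in decode_cleanups(sample):
    print(request, packet_id, when.isoformat())
```

Comparing the decoded times with the date of the outage is a quick way to check whether a log excerpt really covers the period when the server stopped responding.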

Alan Buxey
2012-05-31 06:32:00 UTC
I see a duplicate reply, which suggests that your SQL is too slow. Look at optimizing it, e.g. indexes on the tables you query, or migrate to a better DB.

If it is MySQL, then look at reducing the number of connections to 10.

alan
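
As a rough way to check the slow-backend theory against a whole log file, a Python sketch like the one below could count the "Sending duplicate reply" lines and summarize the "Process time = N" values. It relies only on the two line formats visible in the posted log; the unit of "Process time" is not stated in the thread, so the threshold is an arbitrary assumption, and `summarize` is a hypothetical helper name.

```python
import re

# Line formats taken from the posted log excerpt.
PROCESS_TIME_RE = re.compile(r"^Process time = (\d+)\s*$", re.MULTILINE)
DUPLICATE_RE = re.compile(
    r"^Sending duplicate reply to client \S+ - ID: (\d+)", re.MULTILINE
)


def summarize(log_text, slow_threshold):
    """Summarize per-request process times and duplicate replies."""
    times = [int(value) for value in PROCESS_TIME_RE.findall(log_text)]
    duplicates = [int(value) for value in DUPLICATE_RE.findall(log_text)]
    return {
        "requests_timed": len(times),
        "max_process_time": max(times, default=None),
        # Unit unknown; slow_threshold is in whatever unit the log uses.
        "slow_requests": sum(1 for t in times if t > slow_threshold),
        "duplicate_reply_ids": duplicates,
    }


sample = """\
Process time = 159
Sending duplicate reply to client localhost:33033 - ID: 222
"""

print(summarize(sample, slow_threshold=100))
```

A rising number of duplicate replies alongside growing process times would be consistent with a slow database; few duplicates and flat times would point elsewhere.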
Awais
2012-05-31 10:17:30 UTC
Is there any way to know how many requests were being processed at that
time (when RADIUS stopped responding)?

Fajar A. Nugraha
2012-05-31 10:30:43 UTC
Post by Awais
Is there any way to know how many requests were being processed at that
time (when RADIUS stopped responding)?
Not that I know of. If it starts complaining about "no server thread
available" (or something like that), then the problem is usually a
slow backend. If there are no logs like that, then it might be
something else.

Did the logs you posted come from the running server when it stopped
responding? Because it looked like debug mode output. And when in debug
mode, FR runs single threaded, so there's no point in asking "how many
requests are under process".

... and you STILL haven't said what version of FR you're using. It
matters. Really.

If the server is NOT normally running in debug mode, AND you're using
the latest version of FR, I'm not sure what else you can try. Perhaps
start using "strace" on the running server (when it stops responding)
to find out what it's doing.
--
Fajar
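
For the strace suggestion, a small Python sketch that builds the command line for attaching to an already running process is shown below. The PID 12345 and the output path are placeholders, not values from this thread; the flags used (-f, -tt, -p, -o) are standard strace options.

```python
import shlex


def strace_command(pid, output_path):
    """Build an strace argv that attaches to an already running process."""
    if pid <= 0:
        raise ValueError("pid must be a positive integer")
    return [
        "strace",
        "-f",                # follow threads and child processes
        "-tt",               # prefix each line with the time of day
        "-p", str(pid),      # attach to the running process
        "-o", output_path,   # write the trace to a file
    ]


# Placeholder PID and path, for illustration only.
argv = strace_command(12345, "/tmp/radiusd.strace")
print(shlex.join(argv))
```

The resulting list can be passed to subprocess.run(argv), usually as root. If the trace ends in a blocking read or poll on the database socket, that would point at the backend rather than at the server itself.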
Awais
2012-05-31 11:39:14 UTC
RADIUS version 2.0.1
Yes, these are from the running server when it stopped responding; then we
restarted it...
I searched the full log file and didn't find anything like "no server thread
available".

Alan DeKok
2012-05-31 11:51:47 UTC
Post by Awais
RADIUS Version 2.0.1
Upgrade.

Alan DeKok.
Awais
2012-05-31 11:57:04 UTC
@Alan
But on our VoIP server the FreeRADIUS version is 1.1.6.
And it is running perfectly. Not a single problem or restart since Dec 2011.

Fajar A. Nugraha
2012-05-31 13:11:30 UTC
Post by Awais
@Alan
But on our VoIP server the FreeRADIUS version is 1.1.6.
And it is running perfectly. Not a single problem or restart since Dec 2011.
If you want to use an old version of the software, that's fine, as long as
you know how to support it.

If you want help from users on this list, then the first thing you
should do is upgrade to the latest version. There are lots of bugs fixed
in the latest version. You should even be able to install it easily,
since RHEL5 (and its clones) now has FR 2.1.12 as the freeradius2 package.

If you DON'T want to upgrade, then my best advice is to find someone else
who can help you, because it's unlikely anybody on this list will want
to waste their time looking at an old version of the software,
troubleshooting bugs which might already be fixed in the latest
version, just because you don't want to upgrade.
--
Fajar
Alan DeKok
2012-05-31 13:12:12 UTC
Post by Awais
@Alan
But on our VoIP server the FreeRADIUS version is 1.1.6.
And it is running perfectly. Not a single problem or restart since Dec 2011.
Pick one topic and stick to it. You said you were running 2.0.1. And
you had problems with it.

So.. upgrade.

Saying "but I'm running 1.1.6 on another machine" is a terrible
response. It makes no sense.

Alan DeKok.