Monitor failed: software caused connection abort: recv failed

This error message indicates a general network connectivity error between the monitoring station and the monitored agent servers. It has only been seen on Windows monitoring stations and can usually be resolved with one of the following methods:

Disable TCP Offload Engine (TOE)

If your monitoring station server has a TOE-enabled network card, you may need to disable TOE. TOE is intended to accelerate long-running TCP connections. Because up.time uses many short-lived connections to contact agents, the TOE can mishandle them. Perform the following steps on the monitoring station to disable TOE:

  • Go to Control Panel > Network Connections.
  • Right-click the active NIC and select Properties.
  • Click the Configure button and select the Advanced tab.
  • Locate TCP/IP Offload in the list and set it to Disabled.
  • If the 'TCP/IP Offload' property does not appear in that list, it may be listed under another common name, such as 'Large Send Offload (IPv4)'.

If the TCP/IP Offload setting cannot be found using the previous steps, you may need to disable TCP Offload in a separate program or Windows service that provides advanced network card configuration. For example:

HP Network Configuration Utility

  • Click Advanced.
  • Set TCP Offload Engine (TOE) to disabled.

Broadcom Advanced Control Suite

  • Click on the Primary Adapter.
  • Click on the Resource Allocations tab on the right.
  • If TOE is enabled, click the Configure button and disable it.

Windows TCP Stack is overloaded

If up.time is monitoring more than 500 services, you may find that the default Windows TCP stack options are not sufficient to maintain the outbound up.time connections.  In this case, we recommend adjusting registry settings on the monitoring station to relieve some common TCP bottlenecks:

Under the HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters registry key, create the following new values:

  • Value Name: MaxUserPort
  • Value Type: REG_DWORD
  • Value data: 65534
  • Valid Range: 5000-65534 (decimal)
  • Default: 0x1388 (5000 decimal)

Description: this parameter controls the maximum port number that is used when a program requests an available user port from the system.  Typically, ephemeral (short-lived) ports are allocated between the values of 1024 and 5000 inclusive.
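To make the effect of MaxUserPort concrete, the following minimal Python sketch counts the ephemeral ports available under the default and the raised setting. It assumes the lower bound of 1024 stated above and an inclusive range; the function name `ephemeral_port_count` is hypothetical and is not part of up.time or Windows.

```python
EPHEMERAL_PORT_FLOOR = 1024  # lower bound of the range described above


def ephemeral_port_count(max_user_port: int) -> int:
    """Return how many ports lie in the inclusive range floor..max_user_port."""
    # Valid range for MaxUserPort as listed in this article.
    if not 5000 <= max_user_port <= 65534:
        raise ValueError("MaxUserPort must be between 5000 and 65534")
    return max_user_port - EPHEMERAL_PORT_FLOOR + 1


default_ports = ephemeral_port_count(5000)   # Windows default (0x1388)
tuned_ports = ephemeral_port_count(65534)    # value recommended in this article

print(f"default: {default_ports} ports, tuned: {tuned_ports} ports")
```

Under these assumptions the default leaves 3977 ephemeral ports, while the recommended value leaves 64511.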

  • Value Name: TcpTimedWaitDelay
  • Value Type: REG_DWORD (time in seconds)
  • Value data: 30
  • Valid Range: 30-300 (decimal)
  • Default: 0xF0 (240 decimal)

Description: this parameter determines the length of time that a connection stays in the TIME_WAIT state when being closed.  While a connection is in the TIME_WAIT state, the socket pair cannot be re-used.  This is also known as the 2MSL state because the value should be twice the maximum segment lifetime on the network.  See RFC 793 for further details.
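As a rough illustration of why the two values matter together, the sketch below estimates the sustained rate of new outbound connections that can be opened before the ephemeral port range is exhausted. It is a simplified model, not a measurement: it assumes every closed connection holds one port in TIME_WAIT for the full delay, that the range starts at port 1024, and that no other application is using ephemeral ports. The function name is hypothetical.

```python
def max_sustained_connection_rate(max_user_port: int,
                                  timed_wait_delay_s: int,
                                  port_floor: int = 1024) -> float:
    """Estimate new connections per second before ports run out.

    Simplified model: each closed connection occupies one ephemeral port
    for the whole TIME_WAIT period, so at most
    (number of ports / delay) connections can be opened per second.
    """
    if timed_wait_delay_s <= 0:
        raise ValueError("delay must be positive")
    ports = max_user_port - port_floor + 1
    return ports / timed_wait_delay_s


# Windows defaults listed in this article: MaxUserPort=5000, TcpTimedWaitDelay=240
default_rate = max_sustained_connection_rate(5000, 240)
# Recommended values from this article: MaxUserPort=65534, TcpTimedWaitDelay=30
tuned_rate = max_sustained_connection_rate(65534, 30)

print(f"default settings: ~{default_rate:.1f} connections/s")
print(f"tuned settings:   ~{tuned_rate:.1f} connections/s")
```

Under this model the defaults sustain roughly 16 to 17 new connections per second, while the tuned values sustain roughly 2,150. Actual limits depend on which side closes the connection and on other traffic on the monitoring station.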

Copyright © 2021 IDERA, Inc.