Strange behavior on the Juniper MX240

Hi NANOG,

We are seeing some strange behavior on our Juniper MX240 chassis: it is randomly dropping routes to certain destination IP addresses, and we are getting the following errors on the chassis.

If someone has seen these errors before, please suggest how to resolve them.

May 4 12:42:00 cr01 newsyslog[44735]: logfile turned over due to size>1024K
May 4 12:42:01 /kernel: RT_PFE: RT msg op 1 (PREFIX ADD) failed, err 5 (Invalid)
May 4 12:42:01 /kernel: RT_PFE: RT msg op 3 (PREFIX CHANGE) failed, err 5 (Invalid)
May 4 12:42:01 last message repeated 4 times
May 4 12:42:01 fpc0 RT: IPv6:0 - 2600:40fc:1011::/48 (add rt entry into jtree failed)
May 4 12:42:01 fpc0 RT-HAL,rt_entry_add_msg_proc,2028: rt_halp_vectors->rt_create failed
May 4 12:42:01 fpc0 RT-HAL,rt_entry_add_msg_proc,2092: proto ipv6,len 48 prefix 2600:40fc:1011::/48 nh 1048576
May 4 12:42:01 fpc0 RT-HAL,rt_msg_handler,540: route process failed
May 4 12:42:01 fpc0 RT: Failed prefix add IPv6 - 2001:67c:20fc::/48 (No memory) on FE 0
May 4 12:42:01 fpc0 RT: IPv6:0 - 2001:67c:20fc::/48 (add rt entry into jtree failed)
May 4 12:42:01 fpc0 RT-HAL,rt_entry_add_msg_proc,2028: rt_halp_vectors->rt_create failed
May 4 12:42:01 fpc0 RT-HAL,rt_entry_add_msg_proc,2092: proto ipv6,len 48 prefix 2001:67c:20fc::/48 nh 1048576
May 4 12:42:01 fpc0 RT-HAL,rt_msg_handler,540: route process failed
May 4 12:42:01 fpc0 RT: Failed prefix add IPv6 - 2606:2800:e004::/48 (No memory) on FE 0
May 4 12:42:01 fpc0 RT: Failed prefix add IPv6 - 2a05:3181:ffff::/48 (No memory) on FE 0
May 4 12:42:01 /kernel: RT_PFE: RT msg op 3 (PREFIX CHANGE) failed, err 5 (Invalid)
May 4 12:42:01 /kernel: RT_PFE: RT msg op 1 (PREFIX ADD) failed, err 5 (Invalid)
May 4 12:42:01 /kernel: RT_PFE: RT msg op 2 (PREFIX DELETE) failed, err 5 (Invalid)
May 4 12:42:02 fpc0 RT: Failed prefix add IPv4 - 79.120.22/24 (No memory) on FE 0
May 4 12:42:02 fpc0 RT: IPv4:0 - 79.120.22/24 (add rt entry into jtree failed)
May 4 12:42:02 fpc0 RT-HAL,rt_entry_add_msg_proc,2028: rt_halp_vectors->rt_create failed
May 4 12:42:02 fpc0 RT-HAL,rt_entry_add_msg_proc,2092: proto ipv4,len 24 prefix 79.120.22/24 nh 1048583
May 4 12:42:02 /kernel: RT_PFE: RT msg op 1 (PREFIX ADD) failed, err 5 (Invalid)
May 4 12:42:02 fpc0 RT-HAL,rt_msg_handler,540: route process failed

May 4 09:33:17 fpc0 RSMON: Resource Category:jtree Instance:jtree2-seg0 Type:free-pages Available:20 is less than LWM limit:1638, rsmon_syslog_limit()
May 4 09:33:17 fpc0 RSMON: Resource Category:jtree Instance:jtree2-seg0 Type:free-dwords Available:1280 is less than LWM limit:104857, rsmon_syslog_limit()
May 4 09:33:18 fpc0 RSMON: Resource Category:jtree Instance:jtree3-seg0 Type:free-pages Available:19 is less than LWM limit:1638, rsmon_syslog_limit()
May 4 09:33:18 fpc0 RSMON: Resource Category:jtree Instance:jtree3-seg0 Type:free-dwords Available:1216 is less than LWM limit:104857, rsmon_syslog_limit()
May 4 09:33:18 fpc1 RSMON: Resource Category:jtree Instance:jtree0-seg0 Type:free-pages Available:16 is less than LWM limit:1638, rsmon_syslog_limit()
May 4 09:33:18 fpc1 RSMON: Resource Category:jtree Instance:jtree0-seg0 Type:free-dwords Available:1024 is less than LWM limit:104857, rsmon_syslog_limit()
May 4 09:33:18 fpc1 RSMON: Resource Category:jtree Instance:jtree1-seg0 Type:free-pages Available:15 is less than LWM limit:1638, rsmon_syslog_limit()
May 4 09:33:18 fpc1 RSMON: Resource Category:jtree Instance:jtree1-seg0 Type:free-dwords Available:960 is less than LWM limit:104857, rsmon_syslog_limit()
May 4 09:33:18 fpc1 RSMON: Resource Category:jtree Instance:jtree2-seg0 Type:free-pages Available:19 is less than LWM limit:1638, rsmon_syslog_limit()
May 4 09:33:19 fpc1 RSMON: Resource Category:jtree Instance:jtree2-seg0 Type:free-dwords Available:1216 is less than LWM limit:104857, rsmon_syslog_limit()
May 4 09:33:19 fpc1 RSMON: Resource Category:jtree Instance:jtree3-seg0 Type:free-pages Available:17 is less than LWM limit:1638, rsmon_syslog_limit()
May 4 09:33:19 fpc1 RSMON: Resource Category:jtree Instance:jtree3-seg0 Type:free-dwords Available:1088 is less than LWM limit:104857, rsmon_syslog_limit()
May 4 09:33:19 fpc2 RSMON: Resource Category:jtree Instance:jtree0-seg0 Type:free-pages Available:15 is less than LWM limit:1638, rsmon_syslog_limit()
May 4 09:33:19 fpc2 RSMON: Resource Category:jtree Instance:jtree0-seg0 Type:free-dwords Available:960 is less than LWM limit:104857, rsmon_syslog_limit()
May 4 09:33:19 fpc2 RSMON: Resource Category:jtree Instance:jtree1-seg0 Type:free-pages Available:15 is less than LWM limit:1638, rsmon_syslog_limit()
May 4 09:33:19 fpc2 RSMON: Resource Category:jtree Instance:jtree1-seg0 Type:free-dwords Available:960 is less than LWM limit:104857, rsmon_syslog_limit()

Any suggestions will be helpful

Please do let me know if you have any questions.

Regards and thanks,
Nehul
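
For anyone triaging similar logs, here is a minimal Python sketch that pulls the fields out of the RSMON lines and shows how far each jtree instance has fallen below its low-water mark (LWM). The regular expression is an assumption based only on the line format quoted in this thread, and the names `parse_rsmon` and `pct_of_lwm` are made up for illustration; other Junos releases may log these lines differently.

```python
import re

# Two sample lines copied from the log excerpt quoted in this thread.
LOG = """\
May 4 09:33:17 fpc0 RSMON: Resource Category:jtree Instance:jtree2-seg0 Type:free-pages Available:20 is less than LWM limit:1638, rsmon_syslog_limit()
May 4 09:33:18 fpc1 RSMON: Resource Category:jtree Instance:jtree0-seg0 Type:free-dwords Available:1024 is less than LWM limit:104857, rsmon_syslog_limit()
"""

# Assumed line layout, derived from the samples above.
RSMON_RE = re.compile(
    r"(?P<fpc>fpc\d+) RSMON: Resource Category:(?P<category>\S+) "
    r"Instance:(?P<instance>\S+) Type:(?P<type>\S+) "
    r"Available:(?P<available>\d+) is less than LWM limit:(?P<lwm>\d+)"
)


def parse_rsmon(text):
    """Return one dict per RSMON low-water-mark line found in text."""
    rows = []
    for line in text.splitlines():
        m = RSMON_RE.search(line)
        if m is None:
            continue  # not an RSMON line, skip it
        row = m.groupdict()
        row["available"] = int(row["available"])
        row["lwm"] = int(row["lwm"])
        # What is still free, expressed as a percentage of the LWM limit.
        row["pct_of_lwm"] = 100.0 * row["available"] / row["lwm"]
        rows.append(row)
    return rows


rows = parse_rsmon(LOG)
for r in rows:
    print(
        f'{r["fpc"]} {r["instance"]} {r["type"]}: '
        f'{r["available"]} free vs LWM {r["lwm"]} '
        f'({r["pct_of_lwm"]:.1f}% of LWM)'
    )
```

Feeding the full messages file through `parse_rsmon` gives a quick per-FPC view of which jtree instances are close to empty.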

'show route summary'
'start shell pfe network fpcX'
'show jnh N pool summary'
'show jnh N pool usage'

Actually, is this a DPCE? If so: 'show jtree N summary'

What JUNOS version are you running?

Regards
Paschal Masha | Engineering
Skype ID: paschal.masha

‘show chassis fpc’ might also be useful (or, at least easier :-))

W

Thank you, Saku and Warren. Here is the requested output:

show route summary

inet.0: 879635 destinations, 879649 routes (879634 active, 0 holddown, 1 hidden)
Direct: 9 routes, 8 active
Local: 8 routes, 8 active
OSPF: 928 routes, 925 active
BGP: 878686 routes, 878678 active
Static: 2 routes, 2 active
Aggregate: 15 routes, 12 active

inet.3: 718 destinations, 718 routes (718 active, 0 holddown, 0 hidden)
LDP: 718 routes, 718 active

Test_VRF.inet.0: 1 destinations, 1 routes (1 active, 0 holddown, 0 hidden)
Local: 1 routes, 1 active

mpls.0: 390 destinations, 390 routes (390 active, 0 holddown, 0 hidden)
MPLS: 3 routes, 3 active
LDP: 387 routes, 387 active

inet6.0: 143065 destinations, 286099 routes (143065 active, 0 holddown, 0 hidden)
Direct: 13 routes, 9 active
Local: 10 routes, 10 active
OSPF3: 16 routes, 15 active
BGP: 286060 routes, 143030 active
Static: 1 routes, 1 active

show chassis fpc
                     Temp  CPU Utilization (%)   Memory    Utilization (%)
Slot State            (C)  Total  Interrupt      DRAM (MB) Heap     Buffer
  0  Online            31     11          0       1024       37         29
  1  Online            31     11          0       1024       45         29
  2  Online            30      4          0       1024       36         29

request pfe execute target fpc0 command "show jtree 0 memory extensive"
SENT: Ukern command: show jtree 0 memory extensive
GOT:
GOT: Jtree memory segment 0 (Context: 0x44817bb0)
GOT: -------------------------------------------
GOT: Memory Statistics:
GOT: 16777216 bytes total
GOT: 16715144 bytes used
GOT: 56880 bytes available (7168 bytes from free pages)
GOT: 3024 bytes wasted
GOT: 2168 bytes unusable
GOT: 32768 pages total
GOT: 32519 pages used (2568 pages used in page alloc)
GOT: 235 pages partially used
GOT: 14 pages free (max contiguous = 6)
GOT:
GOT: Partially Filled Pages (In bytes):-
GOT: Unit Avail Overhead
GOT: 8 26256 0
GOT: 16 14320 0
GOT: 24 8352 2040
GOT: 32 352 0
GOT: 48 432 128
GOT:
GOT: Free Page Lists(Pg Size = 512 bytes):-
GOT: Page Bucket Avail(Bytes)
GOT: 1-1 2560
GOT: 3-3 1536
GOT: 6-6 3072
GOT:
GOT: Fragmentation Index = 0.946, (largest free = 3072)
GOT: Counters:
GOT: 2643777 allocs (0 failed)
GOT: 0 releases(partial 0)
GOT: 1095040 frees
GOT: 0 holds
GOT: 7 pending frees(pending bytes 56)
GOT: 0 pending forced
GOT: 0 times free blocked
GOT: 0 sync writes
GOT: Error Counters:-
GOT: 0 bad params
GOT: 0 failed frees
GOT: 0 bad cookie
GOT:
GOT: Jtree memory segment 1 (Context: 0x448997f0)
GOT: -------------------------------------------
GOT: Memory Statistics:
GOT: 16777216 bytes total
GOT: 4589552 bytes used
GOT: 12185384 bytes available (12183552 bytes from free pages)
GOT: 2248 bytes wasted
GOT: 32 bytes unusable
GOT: 32768 pages total
GOT: 8967 pages used (8967 pages used in page alloc)
GOT: 5 pages partially used
GOT: 23796 pages free (max contiguous = 23793)
GOT:
GOT: Partially Filled Pages (In bytes):-
GOT: Unit Avail Overhead
GOT: 8 1416 0
GOT: 16 80 0
GOT: 48 336 32

request pfe execute target fpc1 command "show jtree 0 memory extensive"
SENT: Ukern command: show jtree 0 memory extensive
GOT:
GOT: Jtree memory segment 0 (Context: 0x447cc698)
GOT: -------------------------------------------
GOT: Memory Statistics:
GOT: 16777216 bytes total
GOT: 16715840 bytes used
GOT: 56184 bytes available (8192 bytes from free pages)
GOT: 3024 bytes wasted
GOT: 2168 bytes unusable
GOT: 32768 pages total
GOT: 32533 pages used (2568 pages used in page alloc)
GOT: 219 pages partially used
GOT: 16 pages free (max contiguous = 5)
GOT:
GOT: Partially Filled Pages (In bytes):-
GOT: Unit Avail Overhead
GOT: 8 25544 0
GOT: 16 13312 0
GOT: 24 8352 2040
GOT: 32 352 0
GOT: 48 432 128
GOT:
GOT: Free Page Lists(Pg Size = 512 bytes):-
GOT: Page Bucket Avail(Bytes)
GOT: 1-1 2048
GOT: 2-2 1024
GOT: 5-5 5120
GOT:
GOT: Fragmentation Index = 0.954, (largest free = 2560)
GOT: Counters:
GOT: 2645725 allocs (0 failed)
GOT: 2 releases(partial 0)
GOT: 1096891 frees
GOT: 0 holds
GOT: 0 pending frees(pending bytes 0)
GOT: 0 pending forced
GOT: 0 times free blocked
GOT: 0 sync writes
GOT: Error Counters:-
GOT: 0 bad params
GOT: 0 failed frees
GOT: 0 bad cookie
GOT:
GOT: Jtree memory segment 1 (Context: 0x4484e2d8)
GOT: -------------------------------------------
GOT: Memory Statistics:
GOT: 16777216 bytes total
GOT: 4589504 bytes used
GOT: 12185432 bytes available (12184576 bytes from free pages)
GOT: 2248 bytes wasted
GOT: 32 bytes unusable
GOT: 32768 pages total
GOT: 8967 pages used (8967 pages used in page alloc)
GOT: 3 pages partially used
GOT: 23798 pages free (max contiguous = 23798)
GOT:
GOT: Partially Filled Pages (In bytes):-
GOT: Unit Avail Overhead
GOT: 8 424 0
GOT: 16 96 0
GOT: 48 336 32
GOT:
GOT: Free Page Lists(Pg Size = 512 bytes):-
GOT: Page Bucket Avail(Bytes)
GOT: 27-32768 12184576
GOT:
GOT: Fragmentation Index = 0.000, (largest free = 12184576)
GOT: Counters:
GOT: 45 allocs (0 failed)
GOT: 0 releases(partial 0)
GOT: 0 frees
GOT: 0 holds

JUNOS Version

JUNOS Base OS boot [10.4R9.2]
JUNOS Base OS Software Suite [10.4R9.2]
JUNOS Kernel Software Suite [10.4R9.2]
JUNOS Crypto Software Suite [10.4R9.2]
JUNOS Packet Forwarding Engine Support (M/T Common) [10.4R9.2]
JUNOS Packet Forwarding Engine Support (MX Common) [10.4R9.2]
JUNOS Online Documentation [10.4R9.2]
JUNOS Voice Services Container package [10.4R9.2]
JUNOS Border Gateway Function package [10.4R9.2]
JUNOS Services AACL Container package [10.4R9.2]
JUNOS Services LL-PDF Container package [10.4R9.2]
JUNOS Services PTSP Container package [10.4R9.2]
JUNOS Services Stateful Firewall [10.4R9.2]
JUNOS Services NAT [10.4R9.2]
JUNOS Services Application Level Gateways [10.4R9.2]
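
A side note on reading the jtree memory output: the reported "Fragmentation Index" values happen to match 1 minus (largest free block / bytes available) for the three segments where both figures are shown. That formula is an inference from the numbers in this thread, not something taken from Juniper documentation, and `fragmentation_index` is a hypothetical helper name. A small Python sketch to check it:

```python
# Figures copied from the "show jtree 0 memory extensive" output in this thread:
# (label, bytes available, largest free block in bytes, reported index)
SAMPLES = [
    ("fpc0 segment 0", 56880, 3072, 0.946),
    ("fpc1 segment 0", 56184, 2560, 0.954),
    ("fpc1 segment 1", 12185432, 12184576, 0.000),
]


def fragmentation_index(available_bytes, largest_free_bytes):
    """Assumed formula: share of free memory that lies outside the largest free block."""
    return 1.0 - largest_free_bytes / available_bytes


for label, available, largest, reported in SAMPLES:
    computed = fragmentation_index(available, largest)
    print(f"{label}: computed {computed:.3f}, reported {reported:.3f}")
```

If the inference holds, segment 0 on both cards is not only nearly full but also badly fragmented: the little memory that is free is scattered in small pieces, while segment 1 is one large contiguous free region.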

Dude, JunOS 10.4 reached end of support on 06/08/2014. You have an OS almost 8 years past end of vendor support still in production! No, just no.

Hi,

Now I'm really interested in the uptime of that box...

Thanks,

Sabri

Your line cards (not RE's) are running out of route-storage memory.
As a short-term mitigation, you could try borrowing from segment 1,
normally dedicated to filters,

set chassis memory-enhanced route

but this option may not exist in the version of JunOS you're
running, which as already mentioned is very old.

If the command is accepted, and it lets you commit, you'll then
need to restart each of the FPC's, one at a time, by slot number,
which will take each out of service for a few minutes, so you
probably want to wait until a scheduled maintenance period, and
start with less-important FPC slots first:

request chassis fpc restart slot X
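
To put rough numbers on the imbalance between the two segments, here is a small Python sketch that reads the "bytes total" and "bytes used" lines from the fpc0 output posted earlier and prints utilization per segment. It says nothing about how `memory-enhanced route` actually repartitions memory; it only shows how much headroom each segment has today. The parsing is an assumption based on the output format quoted in this thread, and `segment_usage` is a made-up helper name.

```python
import re

# Excerpt of the fpc0 "show jtree 0 memory extensive" output from this thread.
OUTPUT = """\
GOT: Jtree memory segment 0 (Context: 0x44817bb0)
GOT: 16777216 bytes total
GOT: 16715144 bytes used
GOT: Jtree memory segment 1 (Context: 0x448997f0)
GOT: 16777216 bytes total
GOT: 4589552 bytes used
"""


def segment_usage(text):
    """Map segment number -> (used_bytes, total_bytes)."""
    usage = {}
    segment = None
    total = None
    for line in text.splitlines():
        m = re.search(r"Jtree memory segment (\d+)", line)
        if m:
            # A new segment header resets the running state.
            segment = int(m.group(1))
            total = None
            continue
        m = re.search(r"(\d+) bytes (total|used)", line)
        if m and segment is not None:
            value, kind = int(m.group(1)), m.group(2)
            if kind == "total":
                total = value
            elif total is not None:
                usage[segment] = (value, total)
    return usage


usage = segment_usage(OUTPUT)
for seg, (used, total) in sorted(usage.items()):
    pct = 100.0 * used / total
    print(f"segment {seg}: {pct:.1f}% used, {total - used} bytes not in use")
```

On the figures posted, segment 0 comes out at about 99.6% used and segment 1 at about 27.4%, which is why borrowing from segment 1 looks attractive as a stopgap.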

OK, thank you all for the feedback. We are going to start with the Junos OS upgrade, but we have to open a ticket with JTAC first: the Juniper support website currently offers Junos 15.1, and we are not sure we can jump directly from 10.4 to 15.1. We may have to upgrade step by step. Any other suggestions will be helpful as well.

By the way, the uptime on the Juniper MX chassis was 1589 days.

Try setting “keep-none” on your BGP neighbor(s); I'm not sure whether the cards will still need to be rebooted. Equally, you can just accept a default route, or wait for TAC to take over :)

Regards
Paschal Masha | Engineering
Skype ID: paschal.masha

Almost always direct upgrade works. If you ask TAC, they will likely
suggest a formal process and you'll be doing many upgrades, which
itself isn't actually something that is guaranteed to work (like in
WRL9 case, but that is vmhost RE, not yours).

And like Jordan said, you are out of resources but can extend them
with the command given, which should give you more run rate. You may
want to look in more detail how long you can keep running DPCE until
you're really out.

OK, thank you Harold. We currently have a multihomed setup, so we are still planning how to address it in the best way.

For the Junos OS, can we go from 10.4 to 15.1 directly using the USB installer? That would be helpful. Also, are there any tools available to validate the config before upgrading?

Looks like you are out of FIB slots.

Would recommend reducing the number of routes you need to send into FIB, or upgrading to newer hardware that has more space.

Mark.
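
To see where the route volume comes from, here is a minimal Python sketch that extracts the active-route count per table from the `show route summary` header lines posted earlier. Active routes in the RIB are only a rough proxy for what the line cards have to hold, and the sketch makes no claim about how much jtree memory each prefix consumes. The regular expression assumes the header format shown in this thread, and `active_by_table` is a hypothetical helper name.

```python
import re

# Table header lines copied from the "show route summary" output in this thread.
SUMMARY = """\
inet.0: 879635 destinations, 879649 routes (879634 active, 0 holddown, 1 hidden)
inet.3: 718 destinations, 718 routes (718 active, 0 holddown, 0 hidden)
Test_VRF.inet.0: 1 destinations, 1 routes (1 active, 0 holddown, 0 hidden)
mpls.0: 390 destinations, 390 routes (390 active, 0 holddown, 0 hidden)
inet6.0: 143065 destinations, 286099 routes (143065 active, 0 holddown, 0 hidden)
"""

TABLE_RE = re.compile(
    r"^(?P<table>\S+): (?P<dest>\d+) destinations, (?P<routes>\d+) routes "
    r"\((?P<active>\d+) active"
)


def active_by_table(text):
    """Map routing-table name -> number of active routes."""
    result = {}
    for line in text.splitlines():
        m = TABLE_RE.match(line)
        if m:
            result[m.group("table")] = int(m.group("active"))
    return result


active = active_by_table(SUMMARY)
# Largest tables first.
for table, count in sorted(active.items(), key=lambda kv: -kv[1]):
    print(f"{table:>16}: {count}")

unicast_total = active["inet.0"] + active["inet6.0"]
print("inet.0 + inet6.0 active:", unicast_total)
```

Nearly all of the volume is the full IPv4 and IPv6 BGP tables in inet.0 and inet6.0, so those are the tables where filtering or a default route would make a difference.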

This feature was introduced in 10.4, so he should have it.

And yes, it's only supported for DPC's (I-chip).

Mark.

Curious, what RE are you running?

If you have DPC's still, I'd assume something like the RE-S-1300 or RE-S-2000, but not sure.

I ask because I'm not sure how late a release the older RE's can run.

Mark.

Certainly an option, but this requires quite a bit of babysitting, because as the DFZ oscillates, you can run into issues that send you in circles, largely unaware of FIB issues, especially when just a subset of routes is affected.

So yes, definitely an option, but the OP will need to watch the line cards like a hawk, and be ultra sensitive to debugging regular issues vs. FIB-related issues.

Mark.

OK, got it, Saku. We will go with the direct upgrade.

OK Mark, thank you for the suggestion. We are currently running the RE-S-1300. I am trying to check whether the Juniper docs have a list of the DPCEs with the I-chip, but I have not been able to find one yet. If you know the DPCE model numbers, please let me know.