<?xml version="1.0"?>
<!-- name="generator" content="blosxom/2.0" -->
<!DOCTYPE rss PUBLIC "-//Netscape Communications//DTD RSS 0.91//EN" "http://my.netscape.com/publish/formats/rss-0.91.dtd">

<rss version="0.91">
  <channel>
    <title>Jon Lewis's Blog</title>
    <link>http://jonsblog.lewis.org</link>
    <description>Jon Lewis's weblog.</description>
    <language>en</language>

  <item>
    <title>Black Hole Routing</title>
    <link>http://jonsblog.lewis.org/2011/02/05#blackhole</link>
    <description>&lt;p&gt;A number of Tier-1/Tier-2 network service providers support a feature
called real time black hole routing triggered via BGP.  In simple terms, what
this means is that with providers that support this, you can advertise a route
to your transit provider(s) that tells them &quot;I'd like you to null route this
instead of routing it to me.&quot;  Why would this be useful?  The most likely
situation is an IP on your network is being DDoS'd (Distributed Denial of
Service attack) hard enough that it's congesting your transit pipe(s)
causing increased latency and/or packet loss for all of your internet traffic.&lt;/p&gt;

&lt;p&gt;The usual way to do this (or to express any other desired upstream routing
policy to your transit provider) is via BGP communities.  These are set in
the output route-map for your eBGP peering with the provider.  The following
is an example of how you might set up a system that allows for easy
creation/removal of real time black hole routes (on Cisco gear).&lt;/p&gt;

&lt;p&gt;First:&lt;/p&gt;
&lt;OL&gt;&lt;li&gt;Make sure your provider(s) support this.&lt;/li&gt;
&lt;li&gt;Look up what community strings they use for this.&lt;/li&gt;
&lt;li&gt;You'll probably need to contact each provider and make sure
they're set up to receive /32 IPv4 routes from you.  Assuming they do prefix
filtering, they may not automatically be set up to accept such specific
routes from you.&lt;/li&gt;
&lt;/OL&gt;
&lt;p&gt;Using Level3 as the example, a search for Level3 BGP communities will
turn up that 3356:9999 is Level3's customer accepted community for telling
Level3 to discard traffic for the tagged route.&lt;/p&gt;

&lt;p&gt;Now, you could edit your Level3 output route-map, BGP config (insert a
network statement), and then do a static route every time you want to create
a real time black hole route, but doing it that way is time consuming and
error-prone and you may not want everyone who has enable access mucking
around in your eBGP config.  Instead, why not set things up so all you 
have to do is create a special static route anywhere on your network?&lt;/p&gt;

&lt;p&gt;This config assumes you have separate routers for transit connections and for
internal routing; config changes are needed on each.&lt;/p&gt;

&lt;p&gt;On the transit router that talks to Level3:
&lt;pre&gt;
ip community-list standard BLACKHOLE permit &amp;lt;your-ASN&amp;gt;:9999
!
route-map LEVEL3-OUTPUT permit 5
 match community BLACKHOLE
 set community 3356:9999
&lt;/pre&gt;
This assumes that LEVEL3-OUTPUT is already configured as your output
route-map for your Level3 peering session.&lt;/p&gt;

&lt;p&gt;On your internal routing router(s):

&lt;pre&gt;
ip access-list extended match32
 permit ip any host 255.255.255.255
!
route-map blackhole permit 10
 match ip address match32
 match tag 9999
 set community &amp;lt;your-ASN&amp;gt;:9999
!
router bgp &amp;lt;your-ASN&amp;gt;
 redistribute static route-map blackhole
 redistribute ospf 1 route-map blackhole
&lt;/pre&gt;
&lt;/p&gt;
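
&lt;p&gt;One assumption worth spelling out: the internal community only reaches the
transit router, and 3356:9999 only reaches Level3, if communities are
actually sent on those BGP sessions, which IOS does not do by default.  A
minimal sketch follows.  The neighbor addresses are placeholders that are not
part of the config above, and the last line is the existing route-map
attachment that the transit router config already assumes:&lt;/p&gt;

&lt;pre&gt;
! optional: display communities as AA:NN instead of a single number
ip bgp-community new-format
!
! internal router: iBGP session to the transit router
router bgp &amp;lt;your-ASN&amp;gt;
 neighbor &amp;lt;transit-router-IP&amp;gt; send-community
!
! transit router: eBGP session to Level3
router bgp &amp;lt;your-ASN&amp;gt;
 neighbor &amp;lt;Level3-peer-IP&amp;gt; send-community
 neighbor &amp;lt;Level3-peer-IP&amp;gt; route-map LEVEL3-OUTPUT out
&lt;/pre&gt;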

&lt;p&gt;If you have much experience with BGP, you're hopefully saying to yourself
&quot;but redistribution, especially of the IGP, into BGP is really dangerous&quot;. 
Well, the blackhole route-map is limiting the redistribution to only those
routes tagged with 9999 and which are /32s.  This means if someone gets
clumsy and creates a route for a shorter network with the tag 9999, the
route-map will not match that route, and it won't be redistributed into
BGP.  So this setup won't let you accidentally real time blackhole an entire
CIDR block.  Both OSPF (or IS-IS, if that's your IGP) and static are
redistributed so that the route can be created on this router (as a static
route) or on any other device participating in your IGP.&lt;/p&gt;

&lt;p&gt;Once these config changes have been made, all you need to do to real time
black hole route an IP is log into the internal router, or any router in your
IGP, and add:
&lt;pre&gt;ip route &amp;lt;IP to black hole&amp;gt; 255.255.255.255 null0 tag 9999&lt;/pre&gt;
This will null route the IP in your network and tell your transit providers
to stop sending traffic for the IP to you.&lt;/p&gt;
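
&lt;p&gt;To check that the route made it out, and to undo it once the attack is
over, something like the following should work.  192.0.2.1 is a documentation
address standing in for the attacked IP, and the neighbor address is a
placeholder:&lt;/p&gt;

&lt;pre&gt;
! on the router where the static route was added
show ip route 192.0.2.1
!
! on the internal router: is the /32 in BGP with &amp;lt;your-ASN&amp;gt;:9999?
show ip bgp 192.0.2.1 255.255.255.255
!
! on the transit router: is it being advertised to Level3?
show ip bgp neighbors &amp;lt;Level3-peer-IP&amp;gt; advertised-routes
!
! to remove the black hole, remove the static route
no ip route 192.0.2.1 255.255.255.255 null0 tag 9999
&lt;/pre&gt;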

&lt;p&gt;A really neat side effect of this setup is that you can real time black hole 
an IP without null routing it internally.&lt;/p&gt;

&lt;p&gt;Suppose the IP you want to real time black hole is part of a customer's
/28, and that /28 is configured on the customer's access port,
e.g.&lt;/p&gt;

&lt;pre&gt;
interface FastEthernet0/1
 ip address &amp;lt;customer IP network&amp;gt; 255.255.255.240
&lt;/pre&gt;

&lt;p&gt;You can log into that device and add:&lt;/p&gt;

&lt;pre&gt;
ip route &amp;lt;IP to black hole&amp;gt; 255.255.255.255 FastEthernet0/1 tag 9999
&lt;/pre&gt;

&lt;p&gt;Now the IP is still routed to the customer.  Because the route is tagged
with 9999, and assuming your customer aggregation routers redistribute static
into your IGP (which for me is OSPF), the route will be in your IGP with the
tag.  Your internal router will see this and redistribute it into BGP with
the internally used real time black hole community, and your transit
router(s) will tag the route with the appropriate community to have your
transit provider(s) real time black hole route it.  The IP is still 
reachable inside your ASN, but to the rest of the internet, it's dead as
your transit providers are null routing it.&lt;/p&gt;&lt;/p&gt;
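
&lt;p&gt;The assumption in that last paragraph, that customer aggregation routers
redistribute static routes into the IGP, might look like this with OSPF.  The
process ID matches the &quot;ospf 1&quot; used earlier; the rest of the OSPF config is
assumed to already exist:&lt;/p&gt;

&lt;pre&gt;
! customer aggregation router
! &quot;subnets&quot; is needed for anything other than classful networks,
! which includes /32s
router ospf 1
 redistribute static subnets
&lt;/pre&gt;

&lt;p&gt;The tag on the static route should be carried along as the tag of the
resulting OSPF external route, which is what &quot;match tag 9999&quot; on the internal
router relies on.  Also worth checking: IOS by default only redistributes
OSPF internal routes into BGP, so depending on your setup the internal
router's redistribute line may need to be written as &quot;redistribute ospf 1
match internal external 1 external 2 route-map blackhole&quot; for these routes
to be picked up.&lt;/p&gt;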

</description>
  </item>
  <item>
    <title>One Million Routes</title>
    <link>http://jonsblog.lewis.org/2008/02/09#sup720-20080209</link>
    <description>&lt;p&gt;So, you just upgraded your Cisco 6500/7600 gear to Sup720-3BXLs because
that's the lowest end supervisor module that has the TCAM for full internet
routes (&gt;244k routes).  You may have read in the &lt;a
href=&quot;http://cio.cisco.com/en/US/prod/collateral/switches/ps5718/ps708/product_data_sheet09186a0080159856_ps2797_Products_Data_Sheet.html&quot;&gt;data
sheet&lt;/a&gt; that it's capable of &quot;1,000,000 IPv4 routes; 500,000 IPv6 routes.&quot;
That should be plenty of room for growth, right?  Well, maybe not as much as
you think.&lt;/p&gt;  

&lt;p&gt;Somewhere buried in the fine print (ok, I can't actually find
it even in fine print or an * or anywhere on the data sheet), is the fact
that it's an either/or thing, i.e. the 3BXL can do 1,000,000 IPv4 routes
(and no IPv6 at all), or it can do 500,000 IPv6 routes (and no IPv4 at all). 
In a real world installation, neither of those configs is terribly useful. 
The default settings allow for 524,288 IPv4 routes &lt;b&gt;and&lt;/b&gt; 262,144 IPv6
routes...meaning in its default config, with full internet routes a 
Sup720-3BXL is already at nearly half its capacity of IPv4 routes.  You can
examine this (using recent IOS versions), with the command:
&lt;br&gt;
&lt;i&gt;show platform hardware capacity&lt;/i&gt;
&lt;/p&gt;
Look for the output section labeled &quot;L3 Forwarding Resources&quot;, e.g.
&lt;pre&gt;
L3 Forwarding Resources
             FIB TCAM usage:                     Total        Used       %Used
                  72 bits (IPv4, MPLS, EoM)     524288      230589         44%
                 144 bits (IP mcast, IPv6)      262144           5          1%
&lt;/pre&gt;
This can be tuned with the config command &lt;i&gt;mls cef maximum-routes ip
&amp;lt;N&amp;gt;&lt;/i&gt; where N is the number of IPv4 routes, in thousands, you want to 
be able to handle.  For example, with &quot;mls cef maximum-routes ip 750&quot;, the above
output changes to:
&lt;pre&gt;
L3 Forwarding Resources
             FIB TCAM usage:                     Total        Used       %Used
                  72 bits (IPv4, MPLS, EoM)     770048      230459         30%
                 144 bits (IP mcast, IPv6)      139264           5          1%
&lt;/pre&gt;
&lt;p&gt;Such a split may make more sense, as it leaves more room for anticipated 
IPv4 routing table growth, and in a perfect world, we really shouldn't see
much more than a single IPv6 prefix per ASN.&lt;/p&gt;

&lt;p&gt;Note: The numbers above are from a set of Sup720-3BXLs in a lab environment
with slightly filtered BGP feeds.  &quot;Full routes&quot; would be closer to 240,000
routes.&lt;/p&gt;
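
&lt;p&gt;The two TCAM outputs are consistent with a 144-bit entry taking twice the
space of a 72-bit one, which would also explain the either/or nature of the
data sheet numbers: 524288 + 2 x 262144 = 1048576, and 770048 + 2 x 139264 =
1048576.  As a rough sketch of applying the change, with the caveat that the
new split is generally only applied after a reload, so treat the reload step
as an assumption to verify on your IOS version:&lt;/p&gt;

&lt;pre&gt;
configure terminal
 mls cef maximum-routes ip 750
 end
copy running-config startup-config
! reload during a maintenance window, then confirm the new split
show mls cef maximum-routes
show platform hardware capacity
&lt;/pre&gt;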

</description>
  </item>
  <item>
    <title>RIR Minimums BGP prefix-list</title>
    <link>http://jonsblog.lewis.org/2008/01/19#bgp</link>
    <description>&lt;p&gt;I originally posted this BGP filter to a couple of mailing lists, most
notably the NANOG list, back in September 2007.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;http://www.merit.edu/mail.archives/nanog/2007-09/msg00103.html&quot;&gt;http://www.merit.edu/mail.archives/nanog/2007-09/msg00103.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The reason I put this filter together is that lots of big Cisco routers, in
particular the 6500/7600 series with anything less than the Sup720-3BXL,
were on the verge of running out of space (TCAM in the 6500/7600 case) to
hold routes due to continued growth of the global BGP routing table.  A
large part of this global routing table &quot;growth&quot; is actually gratuitous
deaggregation by networks that either don't care or don't even realize what
they're doing.  Most networks can live without these &quot;garbage routes&quot;, and
since I maintain a couple of 6500/Sup2 routers, I started working on
contingency plans in case we were unable to upgrade to Sup720-3BXLs before
the global routing table + our internal routes hit the magic number of
routes (244k), at which point the Sup2 starts doing &quot;bad things&quot;.&lt;/p&gt;

&lt;p&gt;It should be noted that because some of the really clue-deficient networks
announce only the deaggregates of their CIDRs, using this filter may cause
you to entirely lose routing information to such networks.  Therefore,
unless you're able to get away with that level of BOFHness (&quot;fix your BGP if
you want to talk to us&quot;), I strongly suggest you add (if you don't already
have) one or more default routes to your various transit providers.&lt;/p&gt;
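
&lt;p&gt;A minimal sketch of such defaults, assuming two transit providers whose
next-hop addresses are placeholders here.  The second route is given a higher
administrative distance so it's only used if the first goes away:&lt;/p&gt;

&lt;pre&gt;
ip route 0.0.0.0 0.0.0.0 &amp;lt;transit-A-next-hop-IP&amp;gt;
ip route 0.0.0.0 0.0.0.0 &amp;lt;transit-B-next-hop-IP&amp;gt; 250
&lt;/pre&gt;

&lt;p&gt;Alternatively, a default route learned via BGP from a provider would also
pass this filter, since 0.0.0.0/0 itself matches the final &quot;permit 0.0.0.0/0
le 24&quot; entry.&lt;/p&gt;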

&lt;p&gt;This BGP route filter is based largely on Barry Greene's work available
from&lt;/p&gt;

&lt;p&gt;&lt;a
href=&quot;ftp://ftp-eng.cisco.com/cons/isp/security/Ingress-Prefix-Filter-Templates/T-ip-prefix-filter-ingress-strict-check-v18.txt&quot;&gt;
ftp://ftp-eng.cisco.com/cons/isp/security/Ingress-Prefix-Filter-Templates/T-ip-prefix-filter-ingress-strict-check-v18.txt&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;While working on my version of ISP-Ingress-In-Strict, I noticed a bunch of
inconsistencies between the expected RIR minimum allocations in Barry's
ISP-Ingress-In-Strict and the data actually published by the various
RIRs.&lt;/p&gt;

&lt;p&gt;I've adjusted the appropriate entries and flipped things around so that for
each of the known RIR /8 or shorter prefixes, prefixes longer than RIR
specified minimums (or /24 in cases where the RIR specifies longer than /24!)
are denied.&lt;/p&gt;

&lt;p&gt;At the end of the prefix-list, any prefix /24 or shorter is allowed.  The
advantage to this setup is that known ranges are filtered on known RIR minimums. 
Anything omitted ends up being permitted as long as it's /24 or shorter.&lt;/p&gt;
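
&lt;p&gt;For anyone not used to reading prefix-lists, here is how two of the entries
are meant to be read.  Both entries are copied from the list that follows;
only the comments are added:&lt;/p&gt;

&lt;pre&gt;
! anything inside 58.0.0.0/8 with a prefix length of /22 through /32 is
! denied, so /21 and shorter announcements from that range fall through
! to the final permit
ip prefix-list ISP-Ingress-In-Strict SEQ 4000 deny 58.0.0.0/8 ge 22
!
! anything with a prefix length of /0 through /24 that was not denied by
! an earlier entry is permitted; longer prefixes hit the implicit deny
ip prefix-list ISP-Ingress-In-Strict seq 10200 permit 0.0.0.0/0 le 24
&lt;/pre&gt;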

&lt;p&gt;If you currently use a distribute-list to filter incoming routes, you'll
have to rewrite those rules in prefix-list format and merge them into the
beginning of this prefix-list, as IOS (at least the versions I'm using)
doesn't allow both an input prefix-list and input distribute-list on the
same BGP peer.&lt;/p&gt;
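
&lt;p&gt;As an illustration, assume an existing input distribute-list that uses a
standard access-list to drop anything in 10.0.0.0/8.  The ACL number, the
sequence number 100 and the neighbor address are made up for the example:&lt;/p&gt;

&lt;pre&gt;
! before
access-list 10 deny 10.0.0.0 0.255.255.255
access-list 10 permit any
router bgp &amp;lt;your-ASN&amp;gt;
 neighbor &amp;lt;transit-peer-IP&amp;gt; distribute-list 10 in
!
! after: the same rule as a low-numbered entry in the prefix-list
ip prefix-list ISP-Ingress-In-Strict seq 100 deny 10.0.0.0/8 le 32
router bgp &amp;lt;your-ASN&amp;gt;
 no neighbor &amp;lt;transit-peer-IP&amp;gt; distribute-list 10 in
 neighbor &amp;lt;transit-peer-IP&amp;gt; prefix-list ISP-Ingress-In-Strict in
!
! re-evaluate routes already received from that peer
clear ip bgp &amp;lt;transit-peer-IP&amp;gt; soft in
&lt;/pre&gt;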

&lt;p&gt;What follows is the latest version of what I originally posted to the NANOG
list in September 2007.
&lt;/p&gt;
&lt;pre&gt;
-- jlewis lewis.org 20080118

&lt;p&gt;!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!! APNIC  http://www.apnic.net/db/min-alloc.html !!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!
ip prefix-list ISP-Ingress-In-Strict SEQ 4000 deny 58.0.0.0/8 ge 22
ip prefix-list ISP-Ingress-In-Strict SEQ 4001 deny 59.0.0.0/8 ge 21
ip prefix-list ISP-Ingress-In-Strict SEQ 4002 deny 60.0.0.0/7 ge 21
ip prefix-list ISP-Ingress-In-Strict SEQ 4004 deny 116.0.0.0/6 ge 22
ip prefix-list ISP-Ingress-In-Strict SEQ 4008 deny 120.0.0.0/6 ge 22
ip prefix-list ISP-Ingress-In-Strict SEQ 4011 deny 124.0.0.0/7 ge 21
ip prefix-list ISP-Ingress-In-Strict SEQ 4013 deny 126.0.0.0/8 ge 21
ip prefix-list ISP-Ingress-In-Strict SEQ 4014 deny 202.0.0.0/7 ge 25
ip prefix-list ISP-Ingress-In-Strict SEQ 4016 deny 210.0.0.0/7 ge 21
ip prefix-list ISP-Ingress-In-Strict SEQ 4018 permit 218.100.0.0/16 ge 17 le 24
ip prefix-list ISP-Ingress-In-Strict SEQ 4019 deny 218.0.0.0/7 ge 21
ip prefix-list ISP-Ingress-In-Strict SEQ 4021 deny 220.0.0.0/7 ge 21
ip prefix-list ISP-Ingress-In-Strict seq 4023 deny 222.0.0.0/8 ge 21
!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!! http://www.arin.net/reference/ip_blocks.html#ipv4    !!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!
ip prefix-list ISP-Ingress-In-Strict SEQ 5000 deny 24.0.0.0/8 ge 21
ip prefix-list ISP-Ingress-In-Strict SEQ 5001 deny 63.0.0.0/8 ge 21
ip prefix-list ISP-Ingress-In-Strict SEQ 5002 deny 64.0.0.0/5 ge 21
ip prefix-list ISP-Ingress-In-Strict SEQ 5010 deny 72.0.0.0/6 ge 21
ip prefix-list ISP-Ingress-In-Strict SEQ 5014 deny 76.0.0.0/8 ge 21
ip prefix-list ISP-Ingress-In-Strict SEQ 5015 deny 96.0.0.0/6 ge 21
! these ge 25's are redundant, but left in for accounting purposes
ip prefix-list ISP-Ingress-In-Strict SEQ 5020 deny 198.0.0.0/7 ge 25
ip prefix-list ISP-Ingress-In-Strict SEQ 5022 deny 204.0.0.0/7 ge 25
ip prefix-list ISP-Ingress-In-Strict SEQ 5023 deny 206.0.0.0/7 ge 25
ip prefix-list ISP-Ingress-In-Strict SEQ 5032 deny 208.0.0.0/8 ge 23
ip prefix-list ISP-Ingress-In-Strict SEQ 5033 deny 209.0.0.0/8 ge 21
ip prefix-list ISP-Ingress-In-Strict SEQ 5034 deny 216.0.0.0/8 ge 21
!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!! RIPE NCC  https://www.ripe.net/ripe/docs/ripe-ncc-managed-address-space.html !!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!
ip prefix-list ISP-Ingress-In-Strict SEQ 6000 deny 62.0.0.0/8 ge 20
ip prefix-list ISP-Ingress-In-Strict SEQ 6001 deny 77.0.0.0/8 ge 22
ip prefix-list ISP-Ingress-In-Strict SEQ 6002 deny 78.0.0.0/7 ge 22
ip prefix-list ISP-Ingress-In-Strict SEQ 6004 deny 80.0.0.0/7 ge 21
ip prefix-list ISP-Ingress-In-Strict SEQ 6006 deny 82.0.0.0/8 ge 21
ip prefix-list ISP-Ingress-In-Strict SEQ 6007 deny 83.0.0.0/8 ge 22
ip prefix-list ISP-Ingress-In-Strict SEQ 6008 deny 84.0.0.0/6 ge 22
ip prefix-list ISP-Ingress-In-Strict SEQ 6012 deny 88.0.0.0/7 ge 22
ip prefix-list ISP-Ingress-In-Strict SEQ 6014 deny 90.0.0.0/8 ge 22
ip prefix-list ISP-Ingress-In-Strict SEQ 6015 deny 91.0.0.0/8 ge 25
ip prefix-list ISP-Ingress-In-Strict SEQ 6016 deny 92.0.0.0/6 ge 22
ip prefix-list ISP-Ingress-In-Strict SEQ 6020 deny 193.0.0.0/8 ge 25
ip prefix-list ISP-Ingress-In-Strict SEQ 6021 deny 194.0.0.0/7 ge 25
ip prefix-list ISP-Ingress-In-Strict SEQ 6023 deny 212.0.0.0/7 ge 20
ip prefix-list ISP-Ingress-In-Strict SEQ 6025 deny 217.0.0.0/8 ge 21
!
!
!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!! LACNIC  - http://lacnic.net/en/registro/index.html
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!
ip prefix-list ISP-Ingress-In-Strict SEQ 7000 deny 189.0.0.0/8 ge 21
ip prefix-list ISP-Ingress-In-Strict SEQ 7001 deny 190.0.0.0/8 ge 21
ip prefix-list ISP-Ingress-In-Strict SEQ 7002 deny 200.0.0.0/8 ge 25
ip prefix-list ISP-Ingress-In-Strict SEQ 7003 deny 201.0.0.0/8 ge 21
!
!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!! AFRINIC  http://www.afrinic.net/index.htm                         !!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!
ip prefix-list ISP-Ingress-In-Strict SEQ 8000 deny 41.0.0.0/8 ge 23
ip prefix-list ISP-Ingress-In-Strict SEQ 8001 deny 196.0.0.0/8 ge 23
!
! Final &quot;permit any any&quot; statement.
! This is allowing all the original pre-RIR/RFC2050 allocations through.
! Additional filtering can be added if so desired.
!
!ip prefix-list ISP-Ingress-In-Strict seq 10100 deny 0.0.0.0/0 le 7
ip prefix-list ISP-Ingress-In-Strict seq 10200 permit 0.0.0.0/0 le 24&lt;/p&gt;

&lt;p&gt;&lt;/pre&gt;&lt;/p&gt;

</description>
  </item>
  </channel>
</rss>
