Linux Networking and Network Devices APIs¶
Linux Networking¶
Networking Base Types¶
- enum
sock_type¶ Socket types
Constants
SOCK_STREAM - stream (connection) socket
SOCK_DGRAM - datagram (conn.less) socket
SOCK_RAW - raw socket
SOCK_RDM - reliably-delivered message
SOCK_SEQPACKET - sequential packet socket
SOCK_DCCP - Datagram Congestion Control Protocol socket
SOCK_PACKET - linux specific way of getting packets at the dev level. For writing rarp and other similar things on the user level.
Description
When adding some new socket type please grep ARCH_HAS_SOCKET_TYPE include/asm-*/socket.h, at least MIPS overrides this enum for binary compat reasons.
- enum
sock_shutdown_cmd¶ Shutdown types
Constants
SHUT_RD - shutdown receptions
SHUT_WR - shutdown transmissions
SHUT_RDWR - shutdown receptions/transmissions
- struct
socket¶ general BSD socket
Definition
struct socket {
        socket_state            state;
        short                   type;
        unsigned long           flags;
        struct file             *file;
        struct sock             *sk;
        const struct proto_ops  *ops;
        struct socket_wq        wq;
};

Members

state - socket state (SS_CONNECTED, etc)
type - socket type (SOCK_STREAM, etc)
flags - socket flags (SOCK_NOSPACE, etc)
file - File back pointer for gc
sk - internal networking protocol agnostic socket representation
ops - protocol specific socket operations
wq - wait queue for several uses
Socket Buffer Functions¶
- unsigned int
skb_frag_size(const skb_frag_t * frag)¶ Returns the size of a skb fragment
Parameters
constskb_frag_t*frag- skb fragment
- void
skb_frag_size_set(skb_frag_t * frag, unsigned int size)¶ Sets the size of a skb fragment
Parameters
skb_frag_t*frag- skb fragment
unsignedintsize- size of fragment
- void
skb_frag_size_add(skb_frag_t * frag, int delta)¶ Increments the size of a skb fragment by delta
Parameters
skb_frag_t*frag- skb fragment
intdelta- value to add
- void
skb_frag_size_sub(skb_frag_t * frag, int delta)¶ Decrements the size of a skb fragment by delta
Parameters
skb_frag_t*frag- skb fragment
intdelta- value to subtract
- bool
skb_frag_must_loop(struct page * p)¶ Test if p is a high memory page
Parameters
structpage*p- fragment’s page
skb_frag_foreach_page(f, f_off, f_len, p, p_off, p_len, copied)¶ loop over pages in a fragment
Parameters
f- skb frag to operate on
f_off- offset from start of f->bv_page
f_len- length from f_off to loop over
p- (temp var) current page
p_off - (temp var) offset from start of current page, non-zero only on first page.
p_len - (temp var) length in current page, < PAGE_SIZE only on first and last page.
copied - (temp var) length so far, excluding current p_len.
A fragment can hold a compound page, in which case per-page operations, notably kmap_atomic, must be called for each regular page.
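As a worked illustration (not taken from the kernel tree), the sketch below walks one fragment with skb_frag_foreach_page() and maps each regular page with kmap_atomic(), as the note above requires for compound pages; the helper name frag_byte_sum is hypothetical.

#include <linux/skbuff.h>
#include <linux/highmem.h>

/* Hypothetical helper: sum the payload bytes of one skb fragment, mapping
 * each regular page of a (possibly compound) page as described above. */
static u32 frag_byte_sum(const skb_frag_t *frag)
{
        struct page *p;
        u32 p_off, p_len, copied;
        u32 sum = 0;

        skb_frag_foreach_page(frag, skb_frag_off(frag), skb_frag_size(frag),
                              p, p_off, p_len, copied) {
                const u8 *vaddr = kmap_atomic(p);
                u32 i;

                for (i = 0; i < p_len; i++)
                        sum += vaddr[p_off + i];
                kunmap_atomic((void *)vaddr);
        }
        return sum;
}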
- struct
skb_shared_hwtstamps¶ hardware time stamps
Definition
struct skb_shared_hwtstamps {
        ktime_t hwtstamp;
};

Members

hwtstamp - hardware time stamp transformed into duration since arbitrary point in time
Description
Software time stamps generated by ktime_get_real() are stored in skb->tstamp.
hwtstamps can only be compared against other hwtstamps from the same device.
This structure is attached to packets as part of the skb_shared_info. Use skb_hwtstamps() to get a pointer.
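For illustration only (the helper name rx_stamp is hypothetical), a driver or protocol handler could read the hardware receive timestamp roughly like this, falling back to the software stamp in skb->tstamp:

#include <linux/skbuff.h>
#include <linux/ktime.h>

/* Hypothetical helper: return the hardware timestamp of a received skb,
 * or the software timestamp if the NIC did not provide one. */
static ktime_t rx_stamp(struct sk_buff *skb)
{
        struct skb_shared_hwtstamps *hwts = skb_hwtstamps(skb);

        if (hwts->hwtstamp)
                return hwts->hwtstamp;  /* filled in by the driver/NIC */
        return skb->tstamp;             /* software stamp, if any */
}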
- struct
sk_buff¶ socket buffer
Definition
struct sk_buff { union { struct { struct sk_buff *next; struct sk_buff *prev; union { struct net_device *dev; unsigned long dev_scratch; }; }; struct rb_node rbnode; struct list_head list; }; union { struct sock *sk; int ip_defrag_offset; }; union { ktime_t tstamp; u64 skb_mstamp_ns; }; char cb[48] ; union { struct { unsigned long _skb_refdst; void (*destructor)(struct sk_buff *skb); }; struct list_head tcp_tsorted_anchor; };#if defined(CONFIG_NF_CONNTRACK) || defined(CONFIG_NF_CONNTRACK_MODULE); unsigned long _nfct;#endif; unsigned int len, data_len; __u16 mac_len, hdr_len; __u16 queue_mapping;#ifdef __BIG_ENDIAN_BITFIELD;#define CLONED_MASK (1 << 7);#else;#define CLONED_MASK 1;#endif;#define CLONED_OFFSET() offsetof(struct sk_buff, __cloned_offset); __u8 cloned:1,nohdr:1,fclone:2,peeked:1,head_frag:1, pfmemalloc:1;#ifdef CONFIG_SKB_EXTENSIONS; __u8 active_extensions;#endif;#ifdef __BIG_ENDIAN_BITFIELD;#define PKT_TYPE_MAX (7 << 5);#else;#define PKT_TYPE_MAX 7;#endif;#define PKT_TYPE_OFFSET() offsetof(struct sk_buff, __pkt_type_offset); __u8 pkt_type:3; __u8 ignore_df:1; __u8 nf_trace:1; __u8 ip_summed:2; __u8 ooo_okay:1; __u8 l4_hash:1; __u8 sw_hash:1; __u8 wifi_acked_valid:1; __u8 wifi_acked:1; __u8 no_fcs:1; __u8 encapsulation:1; __u8 encap_hdr_csum:1; __u8 csum_valid:1;#ifdef __BIG_ENDIAN_BITFIELD;#define PKT_VLAN_PRESENT_BIT 7;#else;#define PKT_VLAN_PRESENT_BIT 0;#endif;#define PKT_VLAN_PRESENT_OFFSET() offsetof(struct sk_buff, __pkt_vlan_present_offset); __u8 vlan_present:1; __u8 csum_complete_sw:1; __u8 csum_level:2; __u8 csum_not_inet:1; __u8 dst_pending_confirm:1;#ifdef CONFIG_IPV6_NDISC_NODETYPE; __u8 ndisc_nodetype:2;#endif; __u8 ipvs_property:1; __u8 inner_protocol_type:1; __u8 remcsum_offload:1;#ifdef CONFIG_NET_SWITCHDEV; __u8 offload_fwd_mark:1; __u8 offload_l3_fwd_mark:1;#endif;#ifdef CONFIG_NET_CLS_ACT; __u8 tc_skip_classify:1; __u8 tc_at_ingress:1;#endif;#ifdef CONFIG_NET_REDIRECT; __u8 redirected:1; __u8 from_ingress:1;#endif;#ifdef CONFIG_TLS_DEVICE; __u8 decrypted:1;#endif;#ifdef CONFIG_NET_SCHED; __u16 tc_index;#endif; union { __wsum csum; struct { __u16 csum_start; __u16 csum_offset; }; }; __u32 priority; int skb_iif; __u32 hash; __be16 vlan_proto; __u16 vlan_tci;#if defined(CONFIG_NET_RX_BUSY_POLL) || defined(CONFIG_XPS); union { unsigned int napi_id; unsigned int sender_cpu; };#endif;#ifdef CONFIG_NETWORK_SECMARK; __u32 secmark;#endif; union { __u32 mark; __u32 reserved_tailroom; }; union { __be16 inner_protocol; __u8 inner_ipproto; }; __u16 inner_transport_header; __u16 inner_network_header; __u16 inner_mac_header; __be16 protocol; __u16 transport_header; __u16 network_header; __u16 mac_header; sk_buff_data_t tail; sk_buff_data_t end; unsigned char *head, *data; unsigned int truesize; refcount_t users;#ifdef CONFIG_SKB_EXTENSIONS; struct skb_ext *extensions;#endif;};Members
{unnamed_union}- anonymous
{unnamed_struct}- anonymous
next- Next buffer in list
prev- Previous buffer in list
{unnamed_union}- anonymous
dev- Device we arrived on/are leaving by
dev_scratch - (aka dev) alternate use of dev when dev would be NULL
rbnode - RB tree node, alternative to next/prev for netem/tcp
list- queue head
{unnamed_union}- anonymous
sk- Socket we are owned by
ip_defrag_offset - (aka sk) alternate use of sk, used in fragmentation management
{unnamed_union}- anonymous
tstamp- Time we arrived/left
skb_mstamp_ns - (aka tstamp) earliest departure time; start point for retransmit timer
cb- Control buffer. Free for use by every layer. Put private vars here
{unnamed_union}- anonymous
{unnamed_struct}- anonymous
_skb_refdst- destination entry (with norefcount bit)
destructor- Destruct function
tcp_tsorted_anchor- list structure for TCP (tp->tsorted_sent_queue)
_nfct- Associated connection, if any (with nfctinfo bits)
len- Length of actual data
data_len- Data length
mac_len- Length of link layer header
hdr_len- writable header length of cloned skb
queue_mapping- Queue mapping for multiqueue devices
cloned- Head may be cloned (check refcnt to be sure)
nohdr- Payload reference only, must not modify header
fclone- skbuff clone status
peeked - this packet has been seen already, so stats have been done for it, don’t do them again
head_frag - skb was allocated from page fragments, not allocated by kmalloc() or vmalloc()
pfmemalloc - skbuff was allocated from PFMEMALLOC reserves
active_extensions- active extensions (skb_ext_id types)
pkt_type- Packet class
ignore_df- allow local fragmentation
nf_trace- netfilter packet trace flag
ip_summed- Driver fed us an IP checksum
ooo_okay- allow the mapping of a socket to a queue to be changed
l4_hash - indicate hash is a canonical 4-tuple hash over transport ports.
sw_hash- indicates hash was computed in software stack
wifi_acked_valid- wifi_acked was set
wifi_acked- whether frame was acked on wifi or not
no_fcs- Request NIC to treat last 4 bytes as Ethernet FCS
encapsulation- indicates the inner headers in the skbuff are valid
encap_hdr_csum- software checksum is needed
csum_valid- checksum is already valid
vlan_present- VLAN tag is present
csum_complete_sw- checksum was completed by software
csum_level - indicates the number of consecutive checksums found in the packet minus one that have been verified as CHECKSUM_UNNECESSARY (max 3)
csum_not_inet- use CRC32c to resolve CHECKSUM_PARTIAL
dst_pending_confirm- need to confirm neighbour
ndisc_nodetype- router type (from link layer)
ipvs_property- skbuff is owned by ipvs
inner_protocol_type - whether the inner protocol is ENCAP_TYPE_ETHER or ENCAP_TYPE_IPPROTO
remcsum_offload- remote checksum offload is enabled
offload_fwd_mark- Packet was L2-forwarded in hardware
offload_l3_fwd_mark- Packet was L3-forwarded in hardware
tc_skip_classify- do not classify packet. set by IFB device
tc_at_ingress- used within tc_classify to distinguish in/egress
redirected- packet was redirected by packet classifier
from_ingress- packet was redirected from the ingress path
decrypted- Decrypted SKB
tc_index- Traffic control index
{unnamed_union}- anonymous
csum- Checksum (must include start/offset pair)
{unnamed_struct}- anonymous
csum_start- Offset from skb->head where checksumming should start
csum_offset- Offset from csum_start where checksum should be stored
priority- Packet queueing priority
skb_iif- ifindex of device we arrived on
hash- the packet hash
vlan_proto- vlan encapsulation protocol
vlan_tci- vlan tag control information
{unnamed_union}- anonymous
napi_id- id of the NAPI struct this skb came from
sender_cpu - (aka napi_id) source CPU in XPS
secmark- security marking
{unnamed_union}- anonymous
mark- Generic packet mark
reserved_tailroom - (aka mark) number of bytes of free space available at the tail of an sk_buff
{unnamed_union}- anonymous
inner_protocol- Protocol (encapsulation)
inner_ipproto - (aka inner_protocol) stores ipproto when skb->inner_protocol_type == ENCAP_TYPE_IPPROTO;
inner_transport_header- Inner transport layer header (encapsulation)
inner_network_header- Network layer header (encapsulation)
inner_mac_header- Link layer header (encapsulation)
protocol- Packet protocol from driver
transport_header- Transport layer header
network_header- Network layer header
mac_header- Link layer header
tail- Tail pointer
end- End pointer
head- Head of buffer
data- Data head pointer
truesize- Buffer size
users- User count - see {datagram,tcp}.c
extensions- allocated extensions, valid if active_extensions is nonzero
- bool
skb_pfmemalloc(const structsk_buff * skb)¶ Test if the skb was allocated from PFMEMALLOC reserves
Parameters
conststructsk_buff*skb- buffer
Parameters
conststructsk_buff*skb- buffer
Description
Returns skb dst_entry, regardless of reference taken or not.
Parameters
structsk_buff*skb- buffer
structdst_entry*dst- dst entry
Description
Sets skb dst, assuming a reference was taken on dst and should be released by skb_dst_drop()
- void
skb_dst_set_noref(structsk_buff * skb, struct dst_entry * dst)¶ sets skb dst, hopefully, without taking reference
Parameters
structsk_buff*skb- buffer
structdst_entry*dst- dst entry
Description
Sets skb dst, assuming a reference was not taken on dst. If dst entry is cached, we do not take reference and dst_release will be avoided by refdst_drop. If dst entry is not cached, we take reference, so that the last dst_release can destroy the dst immediately.
Parameters
conststructsk_buff*skb- buffer
Parameters
conststructsk_buff*skb- buffer
Parameters
conststructsk_buff*skb- buffer
Parameters
structsk_buff*skb- buffer
Description
Returns true if we can free the skb.
Parameters
unsignedintsize- size to allocate
gfp_tpriority- allocation mask
Description
This function is a convenient wrapper around __alloc_skb().
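Assuming this entry documents alloc_skb(), the plain wrapper around __alloc_skb(), a minimal usage sketch looks like the following; build_payload_skb() and its parameters are made up for the example.

#include <linux/skbuff.h>

/* Illustrative: build an skb carrying @len bytes of @payload, with room
 * reserved up front for protocol headers to be pushed later. */
static struct sk_buff *build_payload_skb(const void *payload, unsigned int len,
                                         unsigned int headroom)
{
        struct sk_buff *skb = alloc_skb(headroom + len, GFP_KERNEL);

        if (!skb)
                return NULL;
        skb_reserve(skb, headroom);             /* leave space for headers */
        skb_put_data(skb, payload, len);        /* append the payload */
        return skb;
}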
Parameters
conststructsock*sk- socket
conststructsk_buff*skb- buffer
Description
Returns true if skb is a fast clone, and its clone is not freed. Some drivers call skb_orphan() in their ndo_start_xmit(), so we also check that this didn’t happen.
- structsk_buff *
alloc_skb_fclone(unsigned int size, gfp_t priority)¶ allocate a network buffer from fclone cache
Parameters
unsignedintsize- size to allocate
gfp_tpriority- allocation mask
Description
This function is a convenient wrapper around __alloc_skb().
Parameters
structsk_buff*skb- buffer to pad
int pad - space to pad
Ensure that a buffer is followed by a padding area that is zero filled. Used by network drivers which may DMA or transfer data beyond the buffer end onto the wire.
May return error in out of memory cases. The skb is freed on error.
- int
skb_queue_empty(const struct sk_buff_head * list)¶ check if a queue is empty
Parameters
const struct sk_buff_head *list - queue head
Returns true if the queue is empty, false otherwise.
- bool
skb_queue_empty_lockless(const struct sk_buff_head * list)¶ check if a queue is empty
Parameters
const struct sk_buff_head *list - queue head
Returns true if the queue is empty, false otherwise.This variant can be used in lockless contexts.
- bool
skb_queue_is_last(const struct sk_buff_head * list, const structsk_buff * skb)¶ check if skb is the last entry in the queue
Parameters
conststructsk_buff_head*list- queue head
const struct sk_buff *skb - buffer
Returns true if skb is the last buffer on the list.
- bool
skb_queue_is_first(const struct sk_buff_head * list, const structsk_buff * skb)¶ check if skb is the first entry in the queue
Parameters
conststructsk_buff_head*list- queue head
const struct sk_buff *skb - buffer
Returns true if skb is the first buffer on the list.
- structsk_buff *
skb_queue_next(const struct sk_buff_head * list, const structsk_buff * skb)¶ return the next packet in the queue
Parameters
conststructsk_buff_head*list- queue head
const struct sk_buff *skb - current buffer
Return the next packet in list after skb. It is only valid to call this if skb_queue_is_last() evaluates to false.
- structsk_buff *
skb_queue_prev(const struct sk_buff_head * list, const structsk_buff * skb)¶ return the prev packet in the queue
Parameters
conststructsk_buff_head*list- queue head
const struct sk_buff *skb - current buffer
Return the prev packet in list before skb. It is only valid to call this if skb_queue_is_first() evaluates to false.
Parameters
struct sk_buff *skb - buffer to reference
Makes another reference to a socket buffer and returns a pointer to the buffer.
Parameters
const struct sk_buff *skb - buffer to check
Returns true if the buffer was generated with skb_clone() and is one of multiple shared copies of the buffer. Cloned buffers are shared data so must not be written to under normal circumstances.
Parameters
const struct sk_buff *skb - buffer to check
Returns true if modifying the header part of the buffer requires the data to be copied.
Parameters
structsk_buff*skb- buffer to operate on
Parameters
const struct sk_buff *skb - buffer to check
Returns true if more than one person has a reference to this buffer.
- structsk_buff *
skb_share_check(structsk_buff * skb, gfp_t pri)¶ check if buffer is shared and if so clone it
Parameters
structsk_buff*skb- buffer to check
gfp_t pri - priority for memory allocation
If the buffer is shared the buffer is cloned and the old copy drops a reference. A new clone with a single reference is returned. If the buffer is not shared the original buffer is returned. When being called from interrupt status or with spinlocks held pri must be GFP_ATOMIC.
NULL is returned on a memory allocation failure.
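A hedged usage sketch (the helper rx_make_writable() is hypothetical): a receive handler that wants to modify fields of a possibly shared buffer first obtains a private reference.

#include <linux/skbuff.h>

/* Illustrative rx path: before modifying a packet that may be shared
 * (e.g. also delivered to a packet tap), obtain a private reference. */
static struct sk_buff *rx_make_writable(struct sk_buff *skb)
{
        /* If anyone else holds a reference, clone the skb; the old
         * reference is dropped for us.  Runs in softirq context,
         * hence GFP_ATOMIC. */
        skb = skb_share_check(skb, GFP_ATOMIC);
        if (!skb)
                return NULL;    /* allocation failed, original skb was freed */

        skb->priority = 1;      /* now safe to modify the skb's own fields */
        return skb;
}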
Parameters
structsk_buff*skb- buffer to check
gfp_t pri - priority for memory allocation
If the socket buffer is a clone then this function creates a new copy of the data, drops a reference count on the old copy and returns the new copy with the reference count at 1. If the buffer is not a clone the original buffer is returned. When called with a spinlock held or from interrupt state pri must be GFP_ATOMIC.
NULL is returned on a memory allocation failure.
Parameters
const struct sk_buff_head *list_ - list to peek at
Peek an sk_buff. Unlike most other operations you _MUST_ be careful with this one. A peek leaves the buffer on the list and someone else may run off with it. You must hold the appropriate locks or have a private queue to do this.
Returns NULL for an empty list or a pointer to the head element. The reference count is not incremented and the reference is therefore volatile. Use with caution.
- structsk_buff *
__skb_peek(const struct sk_buff_head * list_)¶ peek at the head of a non-empty
sk_buff_head
Parameters
const struct sk_buff_head *list_ - list to peek at
Like
skb_peek(), but the caller knows that the list is not empty.
- structsk_buff *
skb_peek_next(structsk_buff * skb, const struct sk_buff_head * list_)¶ peek skb following the given one from a queue
Parameters
structsk_buff*skb- skb to start from
const struct sk_buff_head *list_ - list to peek at
Returns NULL when the end of the list is met or a pointer to the next element. The reference count is not incremented and the reference is therefore volatile. Use with caution.
Parameters
const struct sk_buff_head *list_ - list to peek at
Peek an sk_buff. Unlike most other operations you _MUST_ be careful with this one. A peek leaves the buffer on the list and someone else may run off with it. You must hold the appropriate locks or have a private queue to do this.
Returns NULL for an empty list or a pointer to the tail element. The reference count is not incremented and the reference is therefore volatile. Use with caution.
- __u32
skb_queue_len(const struct sk_buff_head * list_)¶ get queue length
Parameters
const struct sk_buff_head *list_ - list to measure
Return the length of an sk_buff queue.
- __u32
skb_queue_len_lockless(const struct sk_buff_head * list_)¶ get queue length
Parameters
const struct sk_buff_head *list_ - list to measure
Return the length of an sk_buff queue. This variant can be used in lockless contexts.
- void
__skb_queue_head_init(struct sk_buff_head * list)¶ initialize non-spinlock portions of sk_buff_head
Parameters
struct sk_buff_head *list - queue to initialize
This initializes only the list and queue length aspects of an sk_buff_head object. This allows to initialize the list aspects of an sk_buff_head without reinitializing things like the spinlock. It can also be used for on-stack sk_buff_head objects where the spinlock is known to not be used.
- void
skb_queue_splice(const struct sk_buff_head * list, struct sk_buff_head * head)¶ join two skb lists, this is designed for stacks
Parameters
conststructsk_buff_head*list- the new list to add
structsk_buff_head*head- the place to add it in the first list
- void
skb_queue_splice_init(struct sk_buff_head * list, struct sk_buff_head * head)¶ join two skb lists and reinitialise the emptied list
Parameters
structsk_buff_head*list- the new list to add
struct sk_buff_head *head - the place to add it in the first list
The list at list is reinitialised
- void
skb_queue_splice_tail(const struct sk_buff_head * list, struct sk_buff_head * head)¶ join two skb lists, each list being a queue
Parameters
conststructsk_buff_head*list- the new list to add
structsk_buff_head*head- the place to add it in the first list
- void
skb_queue_splice_tail_init(struct sk_buff_head * list, struct sk_buff_head * head)¶ join two skb lists and reinitialise the emptied list
Parameters
structsk_buff_head*list- the new list to add
struct sk_buff_head *head - the place to add it in the first list
Each of the lists is a queue. The list at list is reinitialised
- void
__skb_queue_after(struct sk_buff_head * list, structsk_buff * prev, structsk_buff * newsk)¶ queue a buffer at the list head
Parameters
structsk_buff_head*list- list to use
structsk_buff*prev- place after this buffer
struct sk_buff *newsk - buffer to queue
Queue a buffer in the middle of a list. This function takes no locks and you must therefore hold required locks before calling it.
A buffer cannot be placed on two lists at the same time.
- void
__skb_queue_head(struct sk_buff_head * list, structsk_buff * newsk)¶ queue a buffer at the list head
Parameters
structsk_buff_head*list- list to use
struct sk_buff *newsk - buffer to queue
Queue a buffer at the start of a list. This function takes no locks and you must therefore hold required locks before calling it.
A buffer cannot be placed on two lists at the same time.
- void
__skb_queue_tail(struct sk_buff_head * list, structsk_buff * newsk)¶ queue a buffer at the list tail
Parameters
structsk_buff_head*list- list to use
struct sk_buff *newsk - buffer to queue
Queue a buffer at the end of a list. This function takes no locks and you must therefore hold required locks before calling it.
A buffer cannot be placed on two lists at the same time.
Parameters
struct sk_buff_head *list - list to dequeue from
Remove the head of the list. This function does not take any locks so must be used with appropriate locks held only. The head item is returned or NULL if the list is empty.
Parameters
struct sk_buff_head *list - list to dequeue from
Remove the tail of the list. This function does not take any locks so must be used with appropriate locks held only. The tail item is returned or NULL if the list is empty.
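The lock-free queue helpers above are typically used on a private queue, as in this illustrative sketch (drain_queue() is a made-up name); the caller is assumed to already hold whatever lock protects src.

#include <linux/skbuff.h>

/* Illustrative: move every buffer from @src to a private on-stack queue
 * and process them outside the caller's lock.  The __skb_* helpers take
 * no locks, so @src must already be protected by the caller. */
static void drain_queue(struct sk_buff_head *src)
{
        struct sk_buff_head tmp;
        struct sk_buff *skb;

        __skb_queue_head_init(&tmp);            /* on-stack, spinlock unused */
        skb_queue_splice_init(src, &tmp);       /* empties @src */

        while ((skb = __skb_dequeue(&tmp)) != NULL)
                kfree_skb(skb);                 /* placeholder for real work */
}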
- void
__skb_fill_page_desc(structsk_buff * skb, int i, struct page * page, int off, int size)¶ initialise a paged fragment in an skb
Parameters
structsk_buff*skb- buffer containing fragment to be initialised
inti- paged fragment index to initialise
structpage*page- the page to use for this fragment
int off - the offset to the data within page
int size - the length of the data
Description
Initialises the i’th fragment of skb to point to size bytes at offset off within page.
Does not take any additional reference on the fragment.
- void
skb_fill_page_desc(structsk_buff * skb, int i, struct page * page, int off, int size)¶ initialise a paged fragment in an skb
Parameters
structsk_buff*skb- buffer containing fragment to be initialised
inti- paged fragment index to initialise
structpage*page- the page to use for this fragment
int off - the offset to the data within page
int size - the length of the data
Description
As per __skb_fill_page_desc() – initialises the i’th fragment of skb to point to size bytes at offset off within page. In addition updates skb such that i is the last fragment.
Does not take any additional reference on the fragment.
Parameters
const struct sk_buff *skb - buffer to check
Return the number of bytes of free space at the head of an sk_buff.
Parameters
const struct sk_buff *skb - buffer to check
Return the number of bytes of free space at the tail of an sk_buff
Parameters
const struct sk_buff *skb - buffer to check
Return the number of bytes of free space at the tail of an sk_buff allocated by sk_stream_alloc()
Parameters
structsk_buff*skb- buffer to alter
int len - bytes to move
Increase the headroom of an empty sk_buff by reducing the tail room. This is only allowed for an empty buffer.
- void
skb_tailroom_reserve(structsk_buff * skb, unsigned int mtu, unsigned int needed_tailroom)¶ adjust reserved_tailroom
Parameters
structsk_buff*skb- buffer to alter
unsignedintmtu- maximum amount of headlen permitted
unsigned int needed_tailroom - minimum amount of reserved_tailroom
Set reserved_tailroom so that headlen can be as large as possible but not larger than mtu and tailroom cannot be smaller than needed_tailroom. The required headroom should already have been reserved before using this function.
- void
pskb_trim_unique(structsk_buff * skb, unsigned int len)¶ remove end from a paged unique (not cloned) buffer
Parameters
structsk_buff*skb- buffer to alter
unsigned int len - new length
This is identical to pskb_trim except that the caller knows that the skb is not cloned so we should never get an error due to out-of-memory.
Parameters
struct sk_buff *skb - buffer to orphan
If a buffer currently has an owner then we call the owner’s destructor function and make the skb unowned. The buffer continues to exist but is no longer charged to its former owner.
Parameters
structsk_buff*skb- buffer to orphan frags from
gfp_t gfp_mask - allocation mask for replacement pages
For each frag in the SKB which needs a destructor (i.e. has an owner) create a copy of that frag and release the original page by calling the destructor.
- void
__skb_queue_purge(struct sk_buff_head * list)¶ empty a list
Parameters
struct sk_buff_head *list - list to empty
Delete all buffers on an sk_buff list. Each buffer is removed from the list and one reference dropped. This function does not take the list lock and the caller must hold the relevant locks to use it.
- structsk_buff *
netdev_alloc_skb(structnet_device * dev, unsigned int length)¶ allocate an skbuff for rx on a specific device
Parameters
structnet_device*dev- network device to receive on
unsigned int length - length to allocate
Allocate a new sk_buff and assign it a usage count of one. The buffer has unspecified headroom built in. Users should allocate the headroom they think they need without accounting for the built in space. The built in space is used for optimisations.
NULL is returned if there is no free memory. Although this function allocates memory it can be called from an interrupt.
- struct page *
__dev_alloc_pages(gfp_t gfp_mask, unsigned int order)¶ allocate page for network Rx
Parameters
gfp_tgfp_mask- allocation priority. Set __GFP_NOMEMALLOC if not for network Rx
unsignedintorder- size of the allocation
Description
Allocate a new page.
NULL is returned if there is no free memory.
- struct page *
__dev_alloc_page(gfp_t gfp_mask)¶ allocate a page for network Rx
Parameters
gfp_tgfp_mask- allocation priority. Set __GFP_NOMEMALLOC if not for network Rx
Description
Allocate a new page.
NULL is returned if there is no free memory.
- void
skb_propagate_pfmemalloc(struct page * page, structsk_buff * skb)¶ Propagate pfmemalloc if skb is allocated after RX page
Parameters
structpage*page- The page that was allocated from skb_alloc_page
structsk_buff*skb- The skb that may need pfmemalloc set
- unsigned int
skb_frag_off(const skb_frag_t * frag)¶ Returns the offset of a skb fragment
Parameters
constskb_frag_t*frag- the paged fragment
- void
skb_frag_off_add(skb_frag_t * frag, int delta)¶ Increments the offset of a skb fragment bydelta
Parameters
skb_frag_t*frag- skb fragment
intdelta- value to add
- void
skb_frag_off_set(skb_frag_t * frag, unsigned int offset)¶ Sets the offset of a skb fragment
Parameters
skb_frag_t*frag- skb fragment
unsignedintoffset- offset of fragment
- void
skb_frag_off_copy(skb_frag_t * fragto, const skb_frag_t * fragfrom)¶ Sets the offset of a skb fragment from another fragment
Parameters
skb_frag_t*fragto- skb fragment where offset is set
constskb_frag_t*fragfrom- skb fragment offset is copied from
- struct page *
skb_frag_page(const skb_frag_t * frag)¶ retrieve the page referred to by a paged fragment
Parameters
constskb_frag_t*frag- the paged fragment
Description
Returns the struct page associated with frag.
- void
__skb_frag_ref(skb_frag_t * frag)¶ take an addition reference on a paged fragment.
Parameters
skb_frag_t*frag- the paged fragment
Description
Takes an additional reference on the paged fragment frag.
- void
skb_frag_ref(structsk_buff * skb, int f)¶ take an addition reference on a paged fragment of an skb.
Parameters
structsk_buff*skb- the buffer
intf- the fragment offset.
Description
Takes an additional reference on the f’th paged fragment of skb.
- void
__skb_frag_unref(skb_frag_t * frag)¶ release a reference on a paged fragment.
Parameters
skb_frag_t*frag- the paged fragment
Description
Releases a reference on the paged fragment frag.
Parameters
structsk_buff*skb- the buffer
intf- the fragment offset
Description
Releases a reference on the f’th paged fragment of skb.
- void *
skb_frag_address(const skb_frag_t * frag)¶ gets the address of the data contained in a paged fragment
Parameters
constskb_frag_t*frag- the paged fragment buffer
Description
Returns the address of the data within frag. The page must already be mapped.
- void *
skb_frag_address_safe(const skb_frag_t * frag)¶ gets the address of the data contained in a paged fragment
Parameters
constskb_frag_t*frag- the paged fragment buffer
Description
Returns the address of the data within frag. Checks that the page is mapped and returns NULL otherwise.
- void
skb_frag_page_copy(skb_frag_t * fragto, const skb_frag_t * fragfrom)¶ sets the page in a fragment from another fragment
Parameters
skb_frag_t*fragto- skb fragment where page is set
constskb_frag_t*fragfrom- skb fragment page is copied from
- void
__skb_frag_set_page(skb_frag_t * frag, struct page * page)¶ sets the page contained in a paged fragment
Parameters
skb_frag_t*frag- the paged fragment
structpage*page- the page to set
Description
Sets the fragment frag to contain page.
- void
skb_frag_set_page(structsk_buff * skb, int f, struct page * page)¶ sets the page contained in a paged fragment of an skb
Parameters
structsk_buff*skb- the buffer
intf- the fragment offset
structpage*page- the page to set
Description
Sets the f’th fragment of skb to contain page.
- dma_addr_t
skb_frag_dma_map(structdevice * dev, const skb_frag_t * frag, size_t offset, size_t size, enum dma_data_direction dir)¶ maps a paged fragment via the DMA API
Parameters
structdevice*dev- the device to map the fragment to
constskb_frag_t*frag- the paged fragment to map
size_t offset - the offset within the fragment (starting at the fragment’s own offset)
size_t size - the number of bytes to map
enum dma_data_direction dir - the direction of the mapping (PCI_DMA_*)
Description
Maps the page associated with frag to device.
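An illustrative transmit-path sketch (map_frags() is a made-up helper, and unmapping of earlier fragments on error is omitted): each paged fragment of an skb is mapped for device DMA.

#include <linux/skbuff.h>
#include <linux/dma-mapping.h>

/* Illustrative tx mapping: map every paged fragment of @skb for DMA to
 * the device.  Error unwinding of earlier mappings is left out. */
static int map_frags(struct device *dev, struct sk_buff *skb,
                     dma_addr_t *addrs)
{
        int i;

        for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
                const skb_frag_t *frag = &skb_shinfo(skb)->frags[i];

                addrs[i] = skb_frag_dma_map(dev, frag, 0,
                                            skb_frag_size(frag),
                                            DMA_TO_DEVICE);
                if (dma_mapping_error(dev, addrs[i]))
                        return -ENOMEM;
        }
        return 0;
}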
- int
skb_clone_writable(const structsk_buff * skb, unsigned int len)¶ is the header of a clone writable
Parameters
conststructsk_buff*skb- buffer to check
unsigned int len - length up to which to write
Returns true if modifying the header part of the cloned buffer does not require the data to be copied.
Parameters
structsk_buff*skb- buffer to cow
unsigned int headroom - needed headroom
If the skb passed lacks sufficient headroom or its data part is shared, data is reallocated. If reallocation fails, an error is returned and original skb is not changed.
The result is skb with writable area skb->head…skb->tail and at least headroom of space at head.
- int
skb_cow_head(structsk_buff * skb, unsigned int headroom)¶ skb_cow but only making the head writable
Parameters
structsk_buff*skb- buffer to cow
unsigned int headroom - needed headroom
This function is identical to skb_cow except that we replace the skb_cloned check by skb_header_cloned. It should be used when you only need to push on some header and do not need to modify the data.
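A hedged example of the intended pattern (push_tag() and the 4-byte tag are hypothetical): make sure the headroom is private and large enough, then push the new header.

#include <linux/skbuff.h>
#include <linux/if_vlan.h>
#include <linux/string.h>

/* Illustrative: make sure we may write VLAN_HLEN bytes in front of the
 * packet, then push a hypothetical 4-byte tag.  Only the header needs to
 * become private, so skb_cow_head() is enough. */
static int push_tag(struct sk_buff *skb, __be32 tag)
{
        int err = skb_cow_head(skb, VLAN_HLEN);

        if (err)
                return err;
        memcpy(skb_push(skb, sizeof(tag)), &tag, sizeof(tag));
        return 0;
}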
Parameters
structsk_buff*skb- buffer to pad
unsigned int len - minimal length
Pads up a buffer to ensure the trailing bytes exist and are blanked. If the buffer already contains sufficient data it is untouched. Otherwise it is extended. Returns zero on success. The skb is freed on error.
- int
__skb_put_padto(structsk_buff * skb, unsigned int len, bool free_on_error)¶ increase size and pad an skbuff up to a minimal size
Parameters
structsk_buff*skb- buffer to pad
unsignedintlen- minimal length
bool free_on_error - free buffer on error
Pads up a buffer to ensure the trailing bytes exist and are blanked. If the buffer already contains sufficient data it is untouched. Otherwise it is extended. Returns zero on success. The skb is freed on error if free_on_error is true.
- int
skb_put_padto(structsk_buff * skb, unsigned int len)¶ increase size and pad an skbuff up to a minimal size
Parameters
structsk_buff*skb- buffer to pad
unsigned int len - minimal length
Pads up a buffer to ensure the trailing bytes exist and are blanked. If the buffer already contains sufficient data it is untouched. Otherwise it is extended. Returns zero on success. The skb is freed on error.
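For illustration, a hypothetical ndo_start_xmit() fragment that pads short Ethernet frames to ETH_ZLEN before handing them to hardware that cannot pad on its own (my_xmit() is not a real driver function):

#include <linux/etherdevice.h>
#include <linux/netdevice.h>

/* Illustrative ndo_start_xmit fragment: hardware that cannot pad short
 * frames itself needs them padded to the Ethernet minimum in software. */
static netdev_tx_t my_xmit(struct sk_buff *skb, struct net_device *dev)
{
        if (skb_put_padto(skb, ETH_ZLEN))
                return NETDEV_TX_OK;    /* skb was already freed on error */

        /* ... hand the now at least ETH_ZLEN byte frame to the hardware ... */
        return NETDEV_TX_OK;
}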
Parameters
struct sk_buff *skb - buffer to linearize
If there is no free memory -ENOMEM is returned, otherwise zero is returned and the old skb data released.
Parameters
conststructsk_buff*skb- buffer to test
Description
Return true if the skb has at least one frag that might be modified by an external entity (as in vmsplice()/sendfile())
Parameters
struct sk_buff *skb - buffer to process
If there is no free memory -ENOMEM is returned, otherwise zero is returned and the old skb data released.
- void
skb_postpull_rcsum(structsk_buff * skb, const void * start, unsigned int len)¶ update checksum for received skb after pull
Parameters
structsk_buff*skb- buffer to update
constvoid*start- start of data before pull
unsigned int len - length of data pulled
After doing a pull on a received packet, you need to call this to update the CHECKSUM_COMPLETE checksum, or set ip_summed to CHECKSUM_NONE so that it can be recomputed from scratch.
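A small sketch of the pull-then-fixup pattern described above (strip_header() is a hypothetical helper):

#include <linux/skbuff.h>

/* Illustrative: strip @hlen bytes of header from the front of @skb and
 * keep a CHECKSUM_COMPLETE value consistent with the shorter packet. */
static void strip_header(struct sk_buff *skb, unsigned int hlen)
{
        const void *start = skb->data;  /* start of data before the pull */

        skb_pull(skb, hlen);
        skb_postpull_rcsum(skb, start, hlen);
}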
- void
skb_postpush_rcsum(structsk_buff * skb, const void * start, unsigned int len)¶ update checksum for received skb after push
Parameters
structsk_buff*skb- buffer to update
constvoid*start- start of data after push
unsigned int len - length of data pushed
After doing a push on a received packet, you need to call this to update the CHECKSUM_COMPLETE checksum.
Parameters
structsk_buff*skb- buffer to update
unsigned int len - length of data pulled
This function performs an skb_push on the packet and updates the CHECKSUM_COMPLETE checksum. It should be used on receive path processing instead of skb_push unless you know that the checksum difference is zero (e.g., a valid IP header) or you are setting ip_summed to CHECKSUM_NONE.
Parameters
structsk_buff*skb- buffer to trim
unsigned int len - new length
This is exactly the same as pskb_trim except that it ensures the checksum of received packets are still valid after the operation. It can change skb pointers.
- bool
skb_needs_linearize(structsk_buff * skb, netdev_features_t features)¶ check if we need to linearize a given skb depending on the given device features.
Parameters
structsk_buff*skb- socket buffer to check
netdev_features_t features - net device features
Returns true if either:
1. skb has frag_list and the device doesn’t support FRAGLIST, or
2. skb is fragmented and the device does not support SG.
- void
skb_get_timestamp(const structsk_buff * skb, struct __kernel_old_timeval * stamp)¶ get timestamp from a skb
Parameters
conststructsk_buff*skb- skb to get stamp from
struct __kernel_old_timeval *stamp - pointer to struct __kernel_old_timeval to store stamp in
Timestamps are stored in the skb as offsets to a base timestamp. This function converts the offset back to a struct timeval and stores it in stamp.
- void
skb_complete_tx_timestamp(structsk_buff * skb, structskb_shared_hwtstamps * hwtstamps)¶ deliver cloned skb with tx timestamps
Parameters
structsk_buff*skb- clone of the original outgoing packet
structskb_shared_hwtstamps*hwtstamps- hardware time stamps
Description
PHY drivers may accept clones of transmitted packets fortimestamping via their phy_driver.txtstamp method. These driversmust call this function to return the skb back to the stack with atimestamp.
- void
skb_tstamp_tx(structsk_buff * orig_skb, structskb_shared_hwtstamps * hwtstamps)¶ queue clone of skb with send time stamps
Parameters
structsk_buff*orig_skb- the original outgoing packet
structskb_shared_hwtstamps*hwtstamps- hardware time stamps, may be NULL if not available
Description
If the skb has a socket associated, then this function clones the skb (thus sharing the actual data and optional structures), stores the optional hardware time stamping information (if non NULL) or generates a software time stamp (otherwise), then queues the clone to the error queue of the socket. Errors are silently ignored.
Parameters
structsk_buff*skb- A socket buffer.
Description
Ethernet MAC Drivers should call this function in their hard_xmit() function immediately before giving the sk_buff to the MAC hardware.
Specifically, one should make absolutely sure that this function is called before TX completion of this packet can trigger. Otherwise the packet could potentially already be freed.
Parameters
structsk_buff*skb- the original outgoing packet
boolacked- ack status
Parameters
struct sk_buff *skb - packet to process
This function calculates the checksum over the entire packet plus the value of skb->csum. The latter can be used to supply the checksum of a pseudo header as used by TCP/UDP. It returns the checksum.
For protocols that contain complete checksums such as ICMP/TCP/UDP, this function can be used to verify that checksum on received packets. In that case the function should return zero if the checksum is correct. In particular, this function will return zero if skb->ip_summed is CHECKSUM_UNNECESSARY which indicates that the hardware has already verified the correctness of the checksum.
- struct
skb_ext¶ sk_buff extensions
Definition
struct skb_ext {
        refcount_t refcnt;
        u8 offset[SKB_EXT_NUM];
        u8 chunks;
        char data[];
};

Members

refcnt - 1 on allocation, deallocated on 0
offset - offset to add to data to obtain extension address
chunks - size currently allocated, stored in SKB_EXT_ALIGN_SHIFT units
data - start of extension data, variable sized
Note
offsets/lengths are stored in chunks of 8 bytes, this allows to use ‘u8’ types while allowing up to 2kb worth of extension data.
Parameters
conststructsk_buff*skb- skb to check
Description
fresh skbs have their ip_summed set to CHECKSUM_NONE. Instead of forcing ip_summed to CHECKSUM_NONE, we can use this helper, to document places where we make this assertion.
Parameters
conststructsk_buff*skb- skb to check
Description
The head on skbs built around a head frag can be removed if they are not cloned. This function returns true if the skb head is locked down due to either being allocated via kmalloc, or by being a clone with multiple references to the head.
- struct
sock_common¶ minimal network layer representation of sockets
Definition
struct sock_common { union { __addrpair skc_addrpair; struct { __be32 skc_daddr; __be32 skc_rcv_saddr; }; }; union { unsigned int skc_hash; __u16 skc_u16hashes[2]; }; union { __portpair skc_portpair; struct { __be16 skc_dport; __u16 skc_num; }; }; unsigned short skc_family; volatile unsigned char skc_state; unsigned char skc_reuse:4; unsigned char skc_reuseport:1; unsigned char skc_ipv6only:1; unsigned char skc_net_refcnt:1; int skc_bound_dev_if; union { struct hlist_node skc_bind_node; struct hlist_node skc_portaddr_node; }; struct proto *skc_prot; possible_net_t skc_net;#if IS_ENABLED(CONFIG_IPV6); struct in6_addr skc_v6_daddr; struct in6_addr skc_v6_rcv_saddr;#endif; atomic64_t skc_cookie; union { unsigned long skc_flags; struct sock *skc_listener; struct inet_timewait_death_row *skc_tw_dr; }; union { struct hlist_node skc_node; struct hlist_nulls_node skc_nulls_node; }; unsigned short skc_tx_queue_mapping;#ifdef CONFIG_XPS; unsigned short skc_rx_queue_mapping;#endif; union { int skc_incoming_cpu; u32 skc_rcv_wnd; u32 skc_tw_rcv_nxt; }; refcount_t skc_refcnt;};Members
{unnamed_union}- anonymous
skc_addrpair - 8-byte-aligned __u64 union of skc_daddr & skc_rcv_saddr
{unnamed_struct}- anonymous
skc_daddr- Foreign IPv4 addr
skc_rcv_saddr- Bound local IPv4 addr
{unnamed_union}- anonymous
skc_hash- hash value used with various protocol lookup tables
skc_u16hashes- two u16 hash values used by UDP lookup tables
{unnamed_union}- anonymous
skc_portpair - __u32 union of skc_dport & skc_num
{unnamed_struct}- anonymous
skc_dport- placeholder for inet_dport/tw_dport
skc_num- placeholder for inet_num/tw_num
skc_family- network address family
skc_state- Connection state
skc_reuse - SO_REUSEADDR setting
skc_reuseport - SO_REUSEPORT setting
skc_ipv6only - socket is IPV6 only
skc_net_refcnt- socket is using net ref counting
skc_bound_dev_if- bound device index if != 0
{unnamed_union}- anonymous
skc_bind_node- bind hash linkage for various protocol lookup tables
skc_portaddr_node- second hash linkage for UDP/UDP-Lite protocol
skc_prot- protocol handlers inside a network family
skc_net- reference to the network namespace of this socket
skc_v6_daddr- IPV6 destination address
skc_v6_rcv_saddr- IPV6 source address
skc_cookie- socket’s cookie value
{unnamed_union}- anonymous
skc_flags - place holder for sk_flags: SO_LINGER (l_onoff), SO_BROADCAST, SO_KEEPALIVE, SO_OOBINLINE settings, SO_TIMESTAMPING settings
skc_listener - connection request listener socket (aka rsk_listener) [union with skc_flags]
skc_tw_dr - (aka tw_dr) ptr to struct inet_timewait_death_row [union with skc_flags]
{unnamed_union} - anonymous
skc_node- main hash linkage for various protocol lookup tables
skc_nulls_node- main hash linkage for TCP/UDP/UDP-Lite protocol
skc_tx_queue_mapping- tx queue number for this connection
skc_rx_queue_mapping- rx queue number for this connection
{unnamed_union}- anonymous
skc_incoming_cpu- record/match cpu processing incoming packets
skc_rcv_wnd - (aka rsk_rcv_wnd) TCP receive window size (possibly scaled) [union with skc_incoming_cpu]
skc_tw_rcv_nxt - (aka tw_rcv_nxt) TCP window next expected seq number [union with skc_incoming_cpu]
skc_refcnt - reference count
This is the minimal network layer representation of sockets, the header for struct sock and struct inet_timewait_sock.
- struct
sock¶ network layer representation of sockets
Definition
struct sock { struct sock_common __sk_common;#define sk_node __sk_common.skc_node;#define sk_nulls_node __sk_common.skc_nulls_node;#define sk_refcnt __sk_common.skc_refcnt;#define sk_tx_queue_mapping __sk_common.skc_tx_queue_mapping;#ifdef CONFIG_XPS;#define sk_rx_queue_mapping __sk_common.skc_rx_queue_mapping;#endif;#define sk_dontcopy_begin __sk_common.skc_dontcopy_begin;#define sk_dontcopy_end __sk_common.skc_dontcopy_end;#define sk_hash __sk_common.skc_hash;#define sk_portpair __sk_common.skc_portpair;#define sk_num __sk_common.skc_num;#define sk_dport __sk_common.skc_dport;#define sk_addrpair __sk_common.skc_addrpair;#define sk_daddr __sk_common.skc_daddr;#define sk_rcv_saddr __sk_common.skc_rcv_saddr;#define sk_family __sk_common.skc_family;#define sk_state __sk_common.skc_state;#define sk_reuse __sk_common.skc_reuse;#define sk_reuseport __sk_common.skc_reuseport;#define sk_ipv6only __sk_common.skc_ipv6only;#define sk_net_refcnt __sk_common.skc_net_refcnt;#define sk_bound_dev_if __sk_common.skc_bound_dev_if;#define sk_bind_node __sk_common.skc_bind_node;#define sk_prot __sk_common.skc_prot;#define sk_net __sk_common.skc_net;#define sk_v6_daddr __sk_common.skc_v6_daddr;#define sk_v6_rcv_saddr __sk_common.skc_v6_rcv_saddr;#define sk_cookie __sk_common.skc_cookie;#define sk_incoming_cpu __sk_common.skc_incoming_cpu;#define sk_flags __sk_common.skc_flags;#define sk_rxhash __sk_common.skc_rxhash; socket_lock_t sk_lock; atomic_t sk_drops; int sk_rcvlowat; struct sk_buff_head sk_error_queue; struct sk_buff *sk_rx_skb_cache; struct sk_buff_head sk_receive_queue; struct { atomic_t rmem_alloc; int len; struct sk_buff *head; struct sk_buff *tail; } sk_backlog;#define sk_rmem_alloc sk_backlog.rmem_alloc; int sk_forward_alloc;#ifdef CONFIG_NET_RX_BUSY_POLL; unsigned int sk_ll_usec; unsigned int sk_napi_id;#endif; int sk_rcvbuf; struct sk_filter __rcu *sk_filter; union { struct socket_wq __rcu *sk_wq; };#ifdef CONFIG_XFRM; struct xfrm_policy __rcu *sk_policy[2];#endif; struct dst_entry *sk_rx_dst; struct dst_entry __rcu *sk_dst_cache; atomic_t sk_omem_alloc; int sk_sndbuf; int sk_wmem_queued; refcount_t sk_wmem_alloc; unsigned long sk_tsq_flags; union { struct sk_buff *sk_send_head; struct rb_root tcp_rtx_queue; }; struct sk_buff *sk_tx_skb_cache; struct sk_buff_head sk_write_queue; __s32 sk_peek_off; int sk_write_pending; __u32 sk_dst_pending_confirm; u32 sk_pacing_status; long sk_sndtimeo; struct timer_list sk_timer; __u32 sk_priority; __u32 sk_mark; unsigned long sk_pacing_rate; unsigned long sk_max_pacing_rate; struct page_frag sk_frag; netdev_features_t sk_route_caps; netdev_features_t sk_route_nocaps; netdev_features_t sk_route_forced_caps; int sk_gso_type; unsigned int sk_gso_max_size; gfp_t sk_allocation; __u32 sk_txhash; u8 sk_padding : 1,sk_kern_sock : 1,sk_no_check_tx : 1,sk_no_check_rx : 1, sk_userlocks : 4; u8 sk_pacing_shift; u16 sk_type; u16 sk_protocol; u16 sk_gso_max_segs; unsigned long sk_lingertime; struct proto *sk_prot_creator; rwlock_t sk_callback_lock; int sk_err, sk_err_soft; u32 sk_ack_backlog; u32 sk_max_ack_backlog; kuid_t sk_uid; struct pid *sk_peer_pid; const struct cred *sk_peer_cred; long sk_rcvtimeo; ktime_t sk_stamp;#if BITS_PER_LONG==32; seqlock_t sk_stamp_seq;#endif; u16 sk_tsflags; u8 sk_shutdown; u32 sk_tskey; atomic_t sk_zckey; u8 sk_clockid; u8 sk_txtime_deadline_mode : 1,sk_txtime_report_errors : 1, sk_txtime_unused : 6; struct socket *sk_socket; void *sk_user_data;#ifdef CONFIG_SECURITY; void *sk_security;#endif; struct sock_cgroup_data sk_cgrp_data; struct 
mem_cgroup *sk_memcg; void (*sk_state_change)(struct sock *sk); void (*sk_data_ready)(struct sock *sk); void (*sk_write_space)(struct sock *sk); void (*sk_error_report)(struct sock *sk); int (*sk_backlog_rcv)(struct sock *sk, struct sk_buff *skb);#ifdef CONFIG_SOCK_VALIDATE_XMIT; struct sk_buff* (*sk_validate_xmit_skb)(struct sock *sk,struct net_device *dev, struct sk_buff *skb);#endif; void (*sk_destruct)(struct sock *sk); struct sock_reuseport __rcu *sk_reuseport_cb;#ifdef CONFIG_BPF_SYSCALL; struct bpf_sk_storage __rcu *sk_bpf_storage;#endif; struct rcu_head sk_rcu;};Members
__sk_common- shared layout with inet_timewait_sock
sk_lock- synchronizer
sk_drops- raw/udp drops counter
sk_rcvlowat - SO_RCVLOWAT setting
sk_error_queue - rarely used
sk_rx_skb_cache- cache copy of recently accessed RX skb
sk_receive_queue- incoming packets
sk_backlog- always used with the per-socket spinlock held
sk_forward_alloc- space allocated forward
sk_ll_usec- usecs to busypoll when there is no data
sk_napi_id- id of the last napi context to receive data for sk
sk_rcvbuf- size of receive buffer in bytes
sk_filter- socket filtering instructions
{unnamed_union}- anonymous
sk_wq- sock wait queue and async head
sk_policy- flow policy
sk_rx_dst- receive input route used by early demux
sk_dst_cache- destination cache
sk_omem_alloc- “o” is “option” or “other”
sk_sndbuf- size of send buffer in bytes
sk_wmem_queued- persistent queue size
sk_wmem_alloc- transmit queue bytes committed
sk_tsq_flags- TCP Small Queues flags
{unnamed_union}- anonymous
sk_send_head- front of stuff to transmit
tcp_rtx_queue - TCP re-transmit queue [union with sk_send_head]
sk_tx_skb_cache- cache copy of recently accessed TX skb
sk_write_queue- Packet sending queue
sk_peek_off- current peek_offset value
sk_write_pending- a write to stream socket waits to start
sk_dst_pending_confirm- need to confirm neighbour
sk_pacing_status- Pacing status (requested, handled by sch_fq)
sk_sndtimeo - SO_SNDTIMEO setting
sk_timer - sock cleanup timer
sk_priority - SO_PRIORITY setting
sk_mark - generic packet mark
sk_pacing_rate - Pacing rate (if supported by transport/packet scheduler)
sk_max_pacing_rate - Maximum pacing rate (SO_MAX_PACING_RATE)
sk_frag - cached page frag
sk_route_caps - route capabilities (e.g. NETIF_F_TSO)
sk_route_nocaps - forbidden route capabilities (e.g. NETIF_F_GSO_MASK)
sk_route_forced_caps - static, forced route capabilities (set in tcp_init_sock())
sk_gso_type - GSO type (e.g. SKB_GSO_TCPV4)
sk_gso_max_size - Maximum GSO segment size to build
sk_allocation- allocation mode
sk_txhash- computed flow hash for use on transmit
sk_padding- unused element for alignment
sk_kern_sock- True if sock is using kernel lock classes
sk_no_check_tx - SO_NO_CHECK setting, set checksum in TX packets
sk_no_check_rx - allow zero checksum in RX packets
sk_userlocks - SO_SNDBUF and SO_RCVBUF settings
sk_pacing_shift - scaling factor for TCP Small Queues
sk_type - socket type (SOCK_STREAM, etc)
sk_protocol - which protocol this socket belongs in this network family
sk_gso_max_segs- Maximum number of GSO segments
sk_lingertime - SO_LINGER l_linger setting
sk_prot_creator - sk_prot of original sock creator (see ipv6_setsockopt, IPV6_ADDRFORM for instance)
sk_callback_lock- used with the callbacks in the end of this struct
sk_err- last error
sk_err_soft - errors that don’t cause failure but are the cause of a persistent failure not just ‘timed out’
sk_ack_backlog- current listen backlog
sk_max_ack_backlog- listen backlog set in listen()
sk_uid- user id of owner
sk_peer_pid - struct pid for this socket’s peer
sk_peer_cred - SO_PEERCRED setting
sk_rcvtimeo - SO_RCVTIMEO setting
sk_stamp - time stamp of last packet received
sk_stamp_seq- lock for accessing sk_stamp on 32 bit architectures only
sk_tsflags- SO_TIMESTAMPING socket options
sk_shutdown - mask of SEND_SHUTDOWN and/or RCV_SHUTDOWN
sk_tskey - counter to disambiguate concurrent tstamp requests
sk_zckey- counter to order MSG_ZEROCOPY notifications
sk_clockid- clockid used by time-based scheduling (SO_TXTIME)
sk_txtime_deadline_mode- set deadline mode for SO_TXTIME
sk_txtime_report_errors- set report errors mode for SO_TXTIME
sk_txtime_unused- unused txtime flags
sk_socket- Identd and reporting IO signals
sk_user_data- RPC layer private data
sk_security- used by security modules
sk_cgrp_data- cgroup data for this cgroup
sk_memcg- this socket’s memory cgroup association
sk_state_change- callback to indicate change in the state of the sock
sk_data_ready- callback to indicate there is data to be processed
sk_write_space- callback to indicate there is bf sending space available
sk_error_report - callback to indicate errors (e.g. MSG_ERRQUEUE)
sk_backlog_rcv - callback to process the backlog
sk_validate_xmit_skb- ptr to an optional validate function
sk_destruct- called at sock freeing time, i.e. when all refcnt == 0
sk_reuseport_cb- reuseport group container
sk_bpf_storage- ptr to cache and control for bpf_sk_storage
sk_rcu- used during RCU grace period
Parameters
conststructsock*sk- socket
sk_for_each_entry_offset_rcu(tpos, pos, head, offset)¶ iterate over a list at a given struct offset
Parameters
tpos- the type * to use as a loop cursor.
pos - the struct hlist_node to use as a loop cursor.
head - the head for your list.
offset- offset of hlist_node within the struct.
Parameters
structsock*sk- socket
boolslow- slow mode
Description
fast unlock socket for user context. If slow mode is on, we call regular release_sock()
Parameters
conststructsock*sk- socket
Return
sk_wmem_alloc minus initial offset of one
Parameters
conststructsock*sk- socket
Return
sk_rmem_alloc
Parameters
conststructsock*sk- socket
Return
true if socket has write or read allocations
- bool
skwq_has_sleeper(struct socket_wq * wq)¶ check if there are any waiting processes
Parameters
structsocket_wq*wq- struct socket_wq
Return
true if socket_wq has waiting processes
Description
The purpose of the skwq_has_sleeper and sock_poll_wait is to wrap the memory barrier call. They were added due to the race found within the tcp code.
Consider following tcp code paths:
CPU1                    CPU2
sys_select              receive packet
...                     ...
__add_wait_queue        update tp->rcv_nxt
...                     ...
tp->rcv_nxt check       sock_def_readable
...                     {
schedule                rcu_read_lock();
                        wq = rcu_dereference(sk->sk_wq);
                        if (wq && waitqueue_active(&wq->wait))
                            wake_up_interruptible(&wq->wait)
                        ...
                        }

The race for tcp fires when the __add_wait_queue changes done by CPU1 stay in its cache, and so does the tp->rcv_nxt update on CPU2 side. The CPU1 could then end up calling schedule and sleep forever if there are no more data on the socket.
- void
sock_poll_wait(struct file * filp, structsocket * sock, poll_table * p)¶ place memory barrier behind the poll_wait call.
Parameters
structfile*filp- file
structsocket*sock- socket to wait on
poll_table*p- poll_table
Description
See the comments in the wq_has_sleeper function.
Parameters
structsock*sk- socket
Description
Use the per task page_frag instead of the per socket one for optimization when we know that we’re in the normal context and own everything that’s associated with current.
gfpflags_allow_blocking() isn’t enough here as direct reclaim may nest inside other socket operations and end up recursing into sk_page_frag() while it’s already in use.
Return
a per task page_frag if context allows that, otherwise a per socket one.
- void
_sock_tx_timestamp(structsock * sk, __u16 tsflags, __u8 * tx_flags, __u32 * tskey)¶ checks whether the outgoing packet is to be time stamped
Parameters
structsock*sk- socket sending this packet
__u16tsflags- timestamping flags to use
__u8*tx_flags- completed with instructions for time stamping
__u32*tskey- filled in with next sk_tskey (not for TCP, which uses seqno)
Note
callers should take care of initial *tx_flags value (usually 0)
Parameters
structsock*sk- socket to eat this skb from
structsk_buff*skb- socket buffer to eat
Description
This routine must be called with interrupts disabled or with the socket locked so that the sk_buff queue operation is ok.
Parameters
structsk_buff*skb- sk_buff to steal the socket from
bool*refcounted- is set to true if the socket is reference-counted
- struct file *
sock_alloc_file(structsocket * sock, int flags, const char * dname)¶ Bind a
socketto afile
Parameters
structsocket*sock- socket
intflags- file status flags
const char *dname - protocol name
Returns the file bound with sock, implicitly storing it in sock->file. If dname is NULL, sets to “”. On failure the return is an ERR pointer (see linux/err.h). This function uses GFP_KERNEL internally.
Parameters
structfile*file- file
int *err - pointer to an error code return
On failure returns NULL and assigns -ENOTSOCK to err.
Parameters
intfd- file handle
int *err - pointer to an error code return
The file handle passed in is locked and the socket it is bound to is returned. If an error occurs the err pointer is overwritten with a negative errno code and NULL is returned. The function checks for both invalid handles and passing a handle which is not a socket.
On a success the socket object pointer is returned.
Parameters
void- no arguments
Description
Allocate a new inode and socket object. The two are bound together and initialised. The socket is then returned. If we are out of inodes NULL is returned. This function uses GFP_KERNEL internally.
Parameters
struct socket *sock - socket to close
The socket is released from the protocol stack if it has a release callback, and the inode is then released if the socket is bound to an inode not a file.
Parameters
structsocket*sock- socket
struct msghdr *msg - message to send
Sends msg through sock, passing through LSM. Returns the number of bytes sent, or an error code.
- int
kernel_sendmsg(structsocket * sock, struct msghdr * msg, struct kvec * vec, size_t num, size_t size)¶ send a message throughsock (kernel-space)
Parameters
structsocket*sock- socket
structmsghdr*msg- message header
structkvec*vec- kernel vec
size_tnum- vec array length
size_t size - total message data size
Builds the message data with vec and sends it through sock. Returns the number of bytes sent, or an error code.
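A minimal kernel-space send sketch, assuming an already connected socket; send_buf() is an illustrative name.

#include <linux/net.h>
#include <linux/socket.h>
#include <linux/uio.h>

/* Illustrative: send @len bytes from a kernel buffer over @sock. */
static int send_buf(struct socket *sock, void *buf, size_t len)
{
        struct kvec vec = { .iov_base = buf, .iov_len = len };
        struct msghdr msg = { .msg_flags = 0 };

        return kernel_sendmsg(sock, &msg, &vec, 1, len);
}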
- int
kernel_sendmsg_locked(structsock * sk, struct msghdr * msg, struct kvec * vec, size_t num, size_t size)¶ send a message throughsock (kernel-space)
Parameters
structsock*sk- sock
structmsghdr*msg- message header
structkvec*vec- output s/g array
size_tnum- output s/g array length
size_t size - total message data size
Builds the message data with vec and sends it through sock. Returns the number of bytes sent, or an error code. Caller must hold sk.
Parameters
structsocket*sock- socket
structmsghdr*msg- message to receive
int flags - message flags
Receives msg from sock, passing through LSM. Returns the total number of bytes received, or an error.
- int
kernel_recvmsg(structsocket * sock, struct msghdr * msg, struct kvec * vec, size_t num, size_t size, int flags)¶ Receive a message from a socket (kernel space)
Parameters
structsocket*sock- The socket to receive the message from
structmsghdr*msg- Received message
structkvec*vec- Input s/g array for message data
size_tnum- Size of input s/g array
size_tsize- Number of bytes to read
int flags - Message flags (MSG_DONTWAIT, etc…)
On return the msg structure contains the scatter/gather array passed in the vec argument. The array is modified so that it consists of the unfilled portion of the original array.
The returned value is the total number of bytes received, or an error.
- struct ns_common *
get_net_ns(struct ns_common * ns)¶ increment the refcount of the network namespace
Parameters
struct ns_common *ns - common namespace (net)
Returns the net’s common namespace.
Parameters
intfamily- protocol family (AF_INET, …)
inttype- communication type (SOCK_STREAM, …)
intprotocol- protocol (0, …)
struct socket **res - new socket
Creates a new socket and assigns it to res, passing through LSM. The new socket initialization is not complete, see kernel_accept(). Returns 0 or an error. On failure res is set to NULL. This function internally uses GFP_KERNEL.
- int
__sock_create(struct net * net, int family, int type, int protocol, structsocket ** res, int kern)¶ creates a socket
Parameters
structnet*net- net namespace
intfamily- protocol family (AF_INET, …)
inttype- communication type (SOCK_STREAM, …)
intprotocol- protocol (0, …)
structsocket**res- new socket
int kern - boolean for kernel space sockets
Creates a new socket and assigns it to res, passing through LSM. Returns 0 or an error. On failure res is set to NULL. kern must be set to true if the socket resides in kernel space. This function internally uses GFP_KERNEL.
Parameters
intfamily- protocol family (AF_INET, …)
inttype- communication type (SOCK_STREAM, …)
intprotocol- protocol (0, …)
structsocket**res- new socket
A wrapper around __sock_create(). Returns 0 or an error. This function internally uses GFP_KERNEL.
- int
sock_create_kern(struct net * net, int family, int type, int protocol, structsocket ** res)¶ creates a socket (kernel space)
Parameters
structnet*net- net namespace
intfamily- protocol family (AF_INET, …)
inttype- communication type (SOCK_STREAM, …)
intprotocol- protocol (0, …)
structsocket**res- new socket
A wrapper around __sock_create(). Returns 0 or an error. This function internally uses GFP_KERNEL.
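A minimal usage sketch, assuming process context and the initial network namespace; the TCP protocol choice is illustrative:

#include <linux/net.h>
#include <linux/in.h>
#include <net/net_namespace.h>

static int example_create(struct socket **res)
{
	int err;

	/* Creates a kernel-space TCP socket; *res is NULL on failure. */
	err = sock_create_kern(&init_net, AF_INET, SOCK_STREAM,
			       IPPROTO_TCP, res);
	if (err)
		return err;

	/* ... use kernel_connect()/kernel_sendmsg() on *res ... */

	sock_release(*res);
	return 0;
}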
- int
sock_register(const struct net_proto_family * ops)¶ add a socket protocol handler
Parameters
conststructnet_proto_family*ops- description of protocol
This function is called by a protocol handler that wants to advertise its address family, and have it linked into the socket interface. The value ops->family corresponds to the socket system call protocol family.
- void
sock_unregister(int family)¶ remove a protocol handler
Parameters
intfamily- protocol family to remove
This function is called by a protocol handler that wants to remove its address family, and have it unlinked from the new socket creation.
If the protocol handler is a module, then it can use module reference counts to protect against new references. If the protocol handler is not a module then it needs to provide its own protection in the ops->create routine.
- int
kernel_bind(structsocket * sock, struct sockaddr * addr, int addrlen)¶ bind an address to a socket (kernel space)
Parameters
structsocket*sock- socket
structsockaddr*addr- address
intaddrlen- length of address
Returns 0 or an error.
Parameters
structsocket*sock- socket
intbacklog- pending connections queue size
Returns 0 or an error.
- int
kernel_accept(structsocket * sock, structsocket ** newsock, int flags)¶ accept a connection (kernel space)
Parameters
structsocket*sock- listening socket
structsocket**newsock- new connected socket
intflags- flags
flags must be SOCK_CLOEXEC, SOCK_NONBLOCK or 0. If it fails, newsock is guaranteed to be NULL. Returns 0 or an error.
- int
kernel_connect(structsocket * sock, struct sockaddr * addr, int addrlen, int flags)¶ connect a socket (kernel space)
Parameters
structsocket*sock- socket
structsockaddr*addr- address
intaddrlen- address length
intflags- flags (O_NONBLOCK, …)
For datagram sockets, addr is the address to which datagrams are sent by default, and the only address from which datagrams are received. For stream sockets, attempts to connect to addr. Returns 0 or an error code.
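A minimal usage sketch of the kernel-space connect path, assuming an IPv4 socket created with sock_create_kern(); the destination address and port are supplied by the caller:

#include <linux/net.h>
#include <linux/in.h>

static int example_connect(struct socket *sock, __be32 daddr, __be16 dport)
{
	struct sockaddr_in sin = {
		.sin_family	 = AF_INET,
		.sin_addr.s_addr = daddr,
		.sin_port	 = dport,
	};

	/* Returns 0 or a negative error, as documented above. */
	return kernel_connect(sock, (struct sockaddr *)&sin, sizeof(sin), 0);
}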
- int
kernel_getsockname(structsocket * sock, struct sockaddr * addr)¶ get the address which the socket is bound (kernel space)
Parameters
structsocket*sock- socket
structsockaddr*addr- address holder
Fills the addr pointer with the address to which the socket is bound. Returns 0 or an error code.
- int
kernel_getpeername(structsocket * sock, struct sockaddr * addr)¶ get the address which the socket is connected (kernel space)
Parameters
structsocket*sock- socket
structsockaddr*addr- address holder
Fills the addr pointer with the address to which the socket is connected. Returns 0 or an error code.
- int
kernel_sendpage(structsocket * sock, struct page * page, int offset, size_t size, int flags)¶ send a
pagethrough a socket (kernel space)
Parameters
structsocket*sock- socket
structpage*page- page
intoffset- page offset
size_tsize- total size in bytes
intflags- flags (MSG_DONTWAIT, …)
Returns the total amount sent in bytes or an error.
- int
kernel_sendpage_locked(structsock * sk, struct page * page, int offset, size_t size, int flags)¶ send a
pagethrough the locked sock (kernel space)
Parameters
structsock*sk- sock
structpage*page- page
intoffset- page offset
size_tsize- total size in bytes
intflags- flags (MSG_DONTWAIT, …)
Returns the total amount sent in bytes or an error. Caller must hold sk.
- int
kernel_sock_shutdown(structsocket * sock, enumsock_shutdown_cmd how)¶ shut down part of a full-duplex connection (kernel space)
Parameters
structsocket*sock- socket
enumsock_shutdown_cmdhow- connection part
Returns 0 or an error.
Parameters
structsock*sk- socket
This routine returns the IP overhead imposed by a socket, i.e. the length of the underlying IP header, depending on whether this is an IPv4 or IPv6 socket and the length from IP options turned on at the socket. Assumes that the caller has a lock on the socket.
- structsk_buff *
__alloc_skb(unsigned int size, gfp_t gfp_mask, int flags, int node)¶ allocate a network buffer
Parameters
unsignedintsize- size to allocate
gfp_tgfp_mask- allocation mask
intflags- If SKB_ALLOC_FCLONE is set, allocate from fclone cache instead of head cache and allocate a cloned (child) skb. If SKB_ALLOC_RX is set, __GFP_MEMALLOC will be used for allocations in case the data is required for writeback
intnode- numa node to allocate memory on
Allocate a new sk_buff. The returned buffer has no headroom and a tail room of at least size bytes. The object has a reference count of one. The return is the buffer. On a failure the return is NULL. Buffers may only be allocated from interrupts using a gfp_mask of GFP_ATOMIC.
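A minimal usage sketch via the common alloc_skb() wrapper around __alloc_skb(); the sizes and the GFP_ATOMIC mask are illustrative:

#include <linux/skbuff.h>
#include <linux/string.h>

static struct sk_buff *example_alloc(void)
{
	struct sk_buff *skb;

	skb = alloc_skb(128 + NET_IP_ALIGN, GFP_ATOMIC);
	if (!skb)
		return NULL;

	skb_reserve(skb, NET_IP_ALIGN);   /* headroom for later headers */
	memset(skb_put(skb, 64), 0, 64);  /* append 64 bytes of payload */
	return skb;
}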
- structsk_buff *
build_skb_around(structsk_buff * skb, void * data, unsigned int frag_size)¶ build a network buffer around provided skb
Parameters
structsk_buff*skb- sk_buff provide by caller, must be memset cleared
void*data- data buffer provided by caller
unsignedintfrag_size- size of data, or 0 if head was kmalloced
- void *
netdev_alloc_frag(unsigned int fragsz)¶ allocate a page fragment
Parameters
unsignedintfragsz- fragment size
Description
Allocates a frag from a page for receive buffer.Uses GFP_ATOMIC allocations.
- structsk_buff *
__netdev_alloc_skb(structnet_device * dev, unsigned int len, gfp_t gfp_mask)¶ allocate an skbuff for rx on a specific device
Parameters
structnet_device*dev- network device to receive on
unsignedintlen- length to allocate
gfp_tgfp_mask- get_free_pages mask, passed to alloc_skb
Allocate a new sk_buff and assign it a usage count of one. The buffer has NET_SKB_PAD headroom built in. Users should allocate the headroom they think they need without accounting for the built in space. The built in space is used for optimisations. NULL is returned if there is no free memory.
- structsk_buff *
__napi_alloc_skb(struct napi_struct * napi, unsigned int len, gfp_t gfp_mask)¶ allocate skbuff for rx in a specific NAPI instance
Parameters
structnapi_struct*napi- napi instance this buffer was allocated for
unsignedintlen- length to allocate
gfp_tgfp_mask- get_free_pages mask, passed to alloc_skb and alloc_pages
Allocate a new sk_buff for use in NAPI receive. This buffer will attempt to allocate the head from a special reserved region used only for NAPI Rx allocation. By doing this we can save several CPU cycles by avoiding having to disable and re-enable IRQs. NULL is returned if there is no free memory.
Parameters
structsk_buff*skb- buffer
Free an sk_buff. Release anything attached to the buffer. Clean the state. This is an internal helper function. Users should always call kfree_skb
Parameters
structsk_buff*skb- buffer to free
Drop a reference to the buffer and free it if the usage count has hit zero.
Parameters
structsk_buff*skb- buffer that triggered an error
Report xmit error if a device callback is tracking this skb. skb must be freed afterwards.
Parameters
structsk_buff*skb- buffer to free
Drop a ref to the buffer and free it if the usage count has hit zero. Functions identically to kfree_skb, but kfree_skb assumes that the frame is being dropped after a failure and notes that
- structsk_buff *
alloc_skb_for_msg(structsk_buff * first)¶ allocate sk_buff to wrap frag list forming a msg
Parameters
structsk_buff*first- first sk_buff of the msg
Parameters
structsk_buff*dst- the skb to receive the contents
structsk_buff*src- the skb to supply the contents
This is identical to skb_clone except that the target skb is supplied by the user.
The target skb is returned upon exit.
Parameters
structsk_buff*skb- the skb to modify
gfp_tgfp_mask- allocation priority
This must be called on SKBTX_DEV_ZEROCOPY skb. It will copy all frags into kernel and drop the reference to userspace pages.
If this function is called from an interrupt gfp_mask() must be GFP_ATOMIC. Returns 0 on success or a negative error code on failure to allocate kernel memory to copy to.
Parameters
structsk_buff*skb- buffer to clone
gfp_tgfp_mask- allocation priority
Duplicate an sk_buff. The new one is not owned by a socket. Both copies share the same packet data but not structure. The new buffer has a reference count of 1. If the allocation fails the function returns NULL otherwise the new buffer is returned.
If this function is called from an interrupt gfp_mask() must be GFP_ATOMIC.
- structsk_buff *
skb_copy(const structsk_buff * skb, gfp_t gfp_mask)¶ create private copy of an sk_buff
Parameters
conststructsk_buff*skb- buffer to copy
gfp_tgfp_mask- allocation priority
Make a copy of both an sk_buff and its data. This is used when the caller wishes to modify the data and needs a private copy of the data to alter. Returns NULL on failure or the pointer to the buffer on success. The returned buffer has a reference count of 1.
As a by-product this function converts a non-linear sk_buff to a linear one, so that the sk_buff becomes completely private and the caller is allowed to modify all the data of the returned buffer. This means that this function is not recommended for use in circumstances when only the header is going to be modified. Use pskb_copy() instead.
- structsk_buff *
__pskb_copy_fclone(structsk_buff * skb, int headroom, gfp_t gfp_mask, bool fclone)¶ create copy of an sk_buff with private head.
Parameters
structsk_buff*skb- buffer to copy
intheadroom- headroom of new skb
gfp_tgfp_mask- allocation priority
boolfclone- if true allocate the copy of the skb from the fclone cache instead of the head cache; it is recommended to set this to true for the cases where the copy will likely be cloned
Make a copy of both an sk_buff and part of its data, located in header. Fragmented data remain shared. This is used when the caller wishes to modify only the header of the sk_buff and needs a private copy of the header to alter. Returns NULL on failure or the pointer to the buffer on success. The returned buffer has a reference count of 1.
- int
pskb_expand_head(structsk_buff * skb, int nhead, int ntail, gfp_t gfp_mask)¶ reallocate header of
sk_buff
Parameters
structsk_buff*skb- buffer to reallocate
intnhead- room to add at head
intntail- room to add at tail
gfp_tgfp_mask- allocation priority
Expands (or creates identical copy, if nhead and ntail are zero) the header of skb. sk_buff itself is not changed. sk_buff MUST have a reference count of 1. Returns zero in the case of success or error, if expansion failed. In the last case, sk_buff is not changed. All the pointers pointing into skb header may change and must be reloaded after call to this function.
- structsk_buff *
skb_copy_expand(const structsk_buff * skb, int newheadroom, int newtailroom, gfp_t gfp_mask)¶ copy and expand sk_buff
Parameters
conststructsk_buff*skb- buffer to copy
intnewheadroom- new free bytes at head
intnewtailroom- new free bytes at tail
gfp_tgfp_mask- allocation priority
Make a copy of both an sk_buff and its data and while doing so allocate additional space.
This is used when the caller wishes to modify the data and needs a private copy of the data to alter as well as more space for new fields. Returns NULL on failure or the pointer to the buffer on success. The returned buffer has a reference count of 1.
You must pass GFP_ATOMIC as the allocation priority if this function is called from an interrupt.
Parameters
structsk_buff*skb- buffer to pad
intpad- space to pad
boolfree_on_error- free buffer on error
Ensure that a buffer is followed by a padding area that is zero filled. Used by network drivers which may DMA or transfer data beyond the buffer end onto the wire.
May return error in out of memory cases. The skb is freed on error if free_on_error is true.
- void *
pskb_put(structsk_buff * skb, structsk_buff * tail, int len)¶ add data to the tail of a potentially fragmented buffer
Parameters
structsk_buff*skb- start of the buffer to use
structsk_buff*tail- tail fragment of the buffer to use
intlen- amount of data to add
This function extends the used data area of the potentially fragmented buffer. tail must be the last fragment of skb – or skb itself. If this would exceed the total buffer size the kernel will panic. A pointer to the first byte of the extra data is returned.
Parameters
structsk_buff*skb- buffer to use
unsignedintlen- amount of data to add
This function extends the used data area of the buffer. If this would exceed the total buffer size the kernel will panic. A pointer to the first byte of the extra data is returned.
Parameters
structsk_buff*skb- buffer to use
unsignedintlen- amount of data to add
This function extends the used data area of the buffer at the buffer start. If this would exceed the total buffer headroom the kernel will panic. A pointer to the first byte of the extra data is returned.
Parameters
structsk_buff*skb- buffer to use
unsignedintlen- amount of data to remove
This function removes data from the start of a buffer, returning the memory to the headroom. A pointer to the next data in the buffer is returned. Once the data has been pulled future pushes will overwrite the old data.
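A minimal sketch of the three data-area helpers described above (skb_put, skb_push, skb_pull), assuming the skb already has enough head and tail room; the lengths are illustrative:

#include <linux/skbuff.h>
#include <linux/string.h>

static void example_put_push_pull(struct sk_buff *skb)
{
	u8 *payload, *hdr;

	payload = skb_put(skb, 32);	/* extend the tail by 32 bytes */
	memset(payload, 0xab, 32);

	hdr = skb_push(skb, 4);		/* prepend a 4-byte header */
	memset(hdr, 0, 4);

	skb_pull(skb, 4);		/* consume the header again */
}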
Parameters
structsk_buff*skb- buffer to alter
unsignedintlen- new length
Cut the length of a buffer down by removing data from the tail. If the buffer is already under the length specified it is not modified. The skb must be linear.
Parameters
structsk_buff*skb- buffer to reallocate
intdelta- number of bytes to advance tail
The function makes sense only on a fragmented sk_buff; it expands the header, moving its tail forward and copying necessary data from the fragmented part. sk_buff MUST have a reference count of 1. Returns NULL (and sk_buff does not change) if the pull failed, or the value of the new tail of the skb in the case of success. All the pointers pointing into skb header may change and must be reloaded after call to this function.
- int
skb_copy_bits(const structsk_buff * skb, int offset, void * to, int len)¶ copy bits from skb to kernel buffer
Parameters
conststructsk_buff*skb- source skb
intoffset- offset in source
void*to- destination buffer
intlen- number of bytes to copy
Copy the specified number of bytes from the source skb to the destination buffer.
- CAUTION ! :
- If its prototype is ever changed, check arch/{*}/net/{*}.S files, since it is called from BPF assembly code.
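A minimal usage sketch: peek at the first bytes of a possibly non-linear skb by copying them into a local buffer; the 16-byte length is illustrative:

#include <linux/skbuff.h>

static int example_peek_header(const struct sk_buff *skb)
{
	u8 hdr[16];

	/* Fails if the skb holds fewer than sizeof(hdr) bytes at offset 0. */
	if (skb_copy_bits(skb, 0, hdr, sizeof(hdr)) < 0)
		return -EINVAL;

	/* ... inspect hdr ... */
	return 0;
}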
- int
skb_store_bits(structsk_buff * skb, int offset, const void * from, int len)¶ store bits from kernel buffer to skb
Parameters
structsk_buff*skb- destination buffer
intoffset- offset in destination
constvoid*from- source buffer
intlen- number of bytes to copy
Copy the specified number of bytes from the source buffer to the destination skb. This function handles all the messy bits of traversing fragment lists and such.
Parameters
structsk_buff*to- destination buffer
structsk_buff*from- source buffer
intlen- number of bytes to copy from source buffer
inthlen- size of linear headroom in destination buffer
Copies up to len bytes from from to to by creating references to the frags in the source buffer.
The hlen as calculated by skb_zerocopy_headlen() specifies the headroom in the to buffer.
Return value:
0: everything is OK
-ENOMEM: couldn't orphan frags of from due to lack of memory
-EFAULT: skb_copy_bits() found some problem with skb geometry
Parameters
structsk_buff_head*list- list to dequeue from
Remove the head of the list. The list lock is taken so the function may be used safely with other locking list functions. The head item is returned or NULL if the list is empty.
Parameters
structsk_buff_head*list- list to dequeue from
Remove the tail of the list. The list lock is taken so the function may be used safely with other locking list functions. The tail item is returned or NULL if the list is empty.
- void
skb_queue_purge(struct sk_buff_head * list)¶ empty a list
Parameters
structsk_buff_head*list- list to empty
Delete all buffers on an sk_buff list. Each buffer is removed from the list and one reference dropped. This function takes the list lock and is atomic with respect to other list locking functions.
- void
skb_queue_head(struct sk_buff_head * list, structsk_buff * newsk)¶ queue a buffer at the list head
Parameters
structsk_buff_head*list- list to use
structsk_buff*newsk- buffer to queue
Queue a buffer at the start of the list. This function takes the list lock and can be used safely with other locking sk_buff functions. A buffer cannot be placed on two lists at the same time.
- void
skb_queue_tail(struct sk_buff_head * list, structsk_buff * newsk)¶ queue a buffer at the list tail
Parameters
structsk_buff_head*list- list to use
structsk_buff*newsk- buffer to queue
Queue a buffer at the tail of the list. This function takes the list lock and can be used safely with other locking sk_buff functions. A buffer cannot be placed on two lists at the same time.
Parameters
structsk_buff*skb- buffer to remove
structsk_buff_head*list- list to use
Remove a packet from a list. The list locks are taken and this function is atomic with respect to other list locked calls.
You must know what list the SKB is on.
- void
skb_append(structsk_buff * old, structsk_buff * newsk, struct sk_buff_head * list)¶ append a buffer
Parameters
structsk_buff*old- buffer to insert after
structsk_buff*newsk- buffer to insert
structsk_buff_head*list- list to use
Place a packet after a given packet in a list. The list locks are taken and this function is atomic with respect to other list locked calls. A buffer cannot be placed on two lists at the same time.
- void
skb_split(structsk_buff * skb, structsk_buff * skb1, const u32 len)¶ Split fragmented skb to two parts at length len.
Parameters
structsk_buff*skb- the buffer to split
structsk_buff*skb1- the buffer to receive the second part
constu32len- new length for skb
- void
skb_prepare_seq_read(structsk_buff * skb, unsigned int from, unsigned int to, struct skb_seq_state * st)¶ Prepare a sequential read of skb data
Parameters
structsk_buff*skb- the buffer to read
unsignedintfrom- lower offset of data to be read
unsignedintto- upper offset of data to be read
structskb_seq_state*st- state variable
Description
Initializes the specified state variable. Must be called beforeinvokingskb_seq_read() for the first time.
- unsigned int
skb_seq_read(unsigned int consumed, const u8 ** data, struct skb_seq_state * st)¶ Sequentially read skb data
Parameters
unsignedintconsumed- number of bytes consumed by the caller so far
constu8**data- destination pointer for data to be returned
structskb_seq_state*st- state variable
Description
Reads a block of skb data at consumed relative to the lower offset specified to skb_prepare_seq_read(). Assigns the head of the data block to data and returns the length of the block or 0 if the end of the skb data or the upper offset has been reached.
The caller is not required to consume all of the data returned, i.e. consumed is typically set to the number of bytes already consumed and the next call to skb_seq_read() will return the remaining part of the block.
- Note 1: The size of each block of data returned can be arbitrary; this limitation is the cost for zerocopy sequential reads of potentially non-linear data.
- Note 2: Fragment lists within fragments are not implemented at the moment; state->root_skb could be replaced with a stack for this purpose.
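A minimal sketch of a sequential, zero-copy walk over the whole skb using skb_prepare_seq_read(), skb_seq_read() and the state variable:

#include <linux/skbuff.h>

static void example_seq_read(struct sk_buff *skb)
{
	struct skb_seq_state st;
	const u8 *data;
	unsigned int consumed = 0, len;

	skb_prepare_seq_read(skb, 0, skb->len, &st);

	while ((len = skb_seq_read(consumed, &data, &st)) != 0) {
		/* ... process len bytes at data ... */
		consumed += len;
	}
	/* skb_seq_read() returned 0, so skb_abort_seq_read() is not needed. */
}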
- void
skb_abort_seq_read(struct skb_seq_state * st)¶ Abort a sequential read of skb data
Parameters
structskb_seq_state*st- state variable
Description
Must be called ifskb_seq_read() was not called until itreturned 0.
- unsigned int
skb_find_text(structsk_buff * skb, unsigned int from, unsigned int to, struct ts_config * config)¶ Find a text pattern in skb data
Parameters
structsk_buff*skb- the buffer to look in
unsignedintfrom- search offset
unsignedintto- search limit
structts_config*config- textsearch configuration
Description
Finds a pattern in the skb data according to the specified textsearch configuration. Use textsearch_next() to retrieve subsequent occurrences of the pattern. Returns the offset to the first occurrence or UINT_MAX if no match was found.
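A minimal usage sketch, assuming the "kmp" textsearch algorithm is available; the search pattern is illustrative:

#include <linux/err.h>
#include <linux/kernel.h>
#include <linux/skbuff.h>
#include <linux/textsearch.h>

static unsigned int example_find(struct sk_buff *skb)
{
	struct ts_config *ts;
	unsigned int pos;

	ts = textsearch_prepare("kmp", "HTTP", 4, GFP_KERNEL, TS_AUTOLOAD);
	if (IS_ERR(ts))
		return UINT_MAX;

	pos = skb_find_text(skb, 0, skb->len, ts);	/* UINT_MAX if absent */
	textsearch_destroy(ts);
	return pos;
}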
Parameters
structsk_buff*skb- buffer to update
unsignedintlen- length of data pulled
This function performs an skb_pull on the packet and updates the CHECKSUM_COMPLETE checksum. It should be used on receive path processing instead of skb_pull unless you know that the checksum difference is zero (e.g., a valid IP header) or you are setting ip_summed to CHECKSUM_NONE.
- structsk_buff *
skb_segment(structsk_buff * head_skb, netdev_features_t features)¶ Perform protocol segmentation on skb.
Parameters
structsk_buff*head_skb- buffer to segment
netdev_features_tfeatures- features for the output path (see dev->features)
This function performs segmentation on the given skb. It returns a pointer to the first in a list of new skbs for the segments. In case of error it returns ERR_PTR(err).
- int
skb_to_sgvec(structsk_buff * skb, struct scatterlist * sg, int offset, int len)¶ Fill a scatter-gather list from a socket buffer
Parameters
structsk_buff*skb- Socket buffer containing the buffers to be mapped
structscatterlist*sg- The scatter-gather list to map into
intoffset- The offset into the buffer’s contents to start mapping
intlen- Length of buffer space to be mapped
Fill the specified scatter-gather list with mappings/pointers into a region of the buffer space attached to a socket buffer. Returns either the number of scatterlist items used, or -EMSGSIZE if the contents could not fit.
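A minimal usage sketch, mapping a whole skb into a caller-provided scatterlist (the EXAMPLE_MAX_SG bound is illustrative), e.g. before handing the buffer to a crypto or DMA API:

#include <linux/scatterlist.h>
#include <linux/skbuff.h>

#define EXAMPLE_MAX_SG	16

static int example_map(struct sk_buff *skb,
		       struct scatterlist sg[EXAMPLE_MAX_SG])
{
	int nents;

	sg_init_table(sg, EXAMPLE_MAX_SG);
	nents = skb_to_sgvec(skb, sg, 0, skb->len);
	if (nents < 0)
		return nents;		/* e.g. -EMSGSIZE: did not fit */

	/* ... hand sg and nents to the consumer ... */
	return nents;
}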
- int
skb_cow_data(structsk_buff * skb, int tailbits, structsk_buff ** trailer)¶ Check that a socket buffer’s data buffers are writable
Parameters
structsk_buff*skb- The socket buffer to check.
inttailbits- Amount of trailing space to be added
structsk_buff**trailer- Returned pointer to the skb where the tailbits space begins
Make sure that the data buffers attached to a socket buffer are writable. If they are not, private copies are made of the data buffers and the socket buffer is set to use these instead.
If tailbits is given, make sure that there is space to write tailbits bytes of data beyond the current end of the socket buffer. trailer will be set to point to the skb in which this space begins.
The number of scatterlist elements required to completely map the COW'd and extended socket buffer will be returned.
Parameters
structsk_buff*skb- the skb to clone
Description
This function creates a clone of a buffer that holds a reference on sk_refcnt. Buffers created via this function are meant to be returned using sock_queue_err_skb, or freed via kfree_skb.
When passing buffers allocated with this function to sock_queue_err_skb it is necessary to wrap the call with sock_hold/sock_put in order to prevent the socket from being released prior to being enqueued on the sk_error_queue.
- bool
skb_partial_csum_set(structsk_buff * skb, u16 start, u16 off)¶ set up and verify partial csum values for packet
Parameters
structsk_buff*skb- the skb to set
u16start- the number of bytes after skb->data to start checksumming.
u16off- the offset from start to place the checksum.
Description
For untrusted partially-checksummed packets, we need to make sure the valuesfor skb->csum_start and skb->csum_offset are valid so we don’t oops.
This function checks and sets those values and skb->ip_summed: if thisreturns false you should drop the packet.
Parameters
structsk_buff*skb- the skb to set up
boolrecalculate- if true the pseudo-header checksum will be recalculated
- structsk_buff *
skb_checksum_trimmed(structsk_buff * skb, unsigned int transport_len, __sum16(*skb_chkf)(structsk_buff *skb))¶ validate checksum of an skb
Parameters
structsk_buff*skb- the skb to check
unsignedinttransport_len- the data length beyond the network header
__sum16(*)(structsk_buff*skb)skb_chkf- checksum function to use
Description
Applies the given checksum function skb_chkf to the provided skb.Returns a checked and maybe trimmed skb. Returns NULL on error.
If the skb has data beyond the given transport length, then atrimmed & cloned skb is checked and returned.
Caller needs to set the skb transport header and free any returned skb if itdiffers from the provided skb.
- bool
skb_try_coalesce(structsk_buff * to, structsk_buff * from, bool * fragstolen, int * delta_truesize)¶ try to merge skb to prior one
Parameters
structsk_buff*to- prior buffer
structsk_buff*from- buffer to add
bool*fragstolen- pointer to boolean
int*delta_truesize- how much more was allocated than was requested
Parameters
structsk_buff*skb- buffer to clean
boolxnet- packet is crossing netns
Description
skb_scrub_packet can be used after encapsulating or decapsulating a packet into/from a tunnel. Some information has to be cleared during these operations. skb_scrub_packet can also be used to clean a skb before injecting it in another namespace (xnet == true). We have to clear all information in the skb that could impact namespace isolation.
- bool
skb_gso_validate_network_len(const structsk_buff * skb, unsigned int mtu)¶ Will a split GSO skb fit into a given MTU?
Parameters
conststructsk_buff*skb- GSO skb
unsignedintmtu- MTU to validate against
Description
skb_gso_validate_network_len validates if a given skb will fit awanted MTU once split. It considers L3 headers, L4 headers, and thepayload.
- bool
skb_gso_validate_mac_len(const structsk_buff * skb, unsigned int len)¶ Will a split GSO skb fit in a given length?
Parameters
conststructsk_buff*skb- GSO skb
unsignedintlen- length to validate against
Description
skb_gso_validate_mac_len validates if a given skb will fit a wantedlength once split, including L2, L3 and L4 headers and the payload.
- int
skb_mpls_push(structsk_buff * skb, __be32 mpls_lse, __be16 mpls_proto, int mac_len, bool ethernet)¶ push a new MPLS header after mac_len bytes from start of the packet
Parameters
structsk_buff*skb- buffer
__be32mpls_lse- MPLS label stack entry to push
__be16mpls_proto- ethertype of the new MPLS header (expects 0x8847 or 0x8848)
intmac_len- length of the MAC header
boolethernet- flag to indicate if the resulting packet after skb_mpls_push is ethernet
Description
Expects skb->data at mac header.
Returns 0 on success, -errno otherwise.
- int
skb_mpls_pop(structsk_buff * skb, __be16 next_proto, int mac_len, bool ethernet)¶ pop the outermost MPLS header
Parameters
structsk_buff*skb- buffer
__be16next_proto- ethertype of header after popped MPLS header
intmac_len- length of the MAC header
boolethernet- flag to indicate if the packet is ethernet
Description
Expects skb->data at mac header.
Returns 0 on success, -errno otherwise.
- int
skb_mpls_update_lse(structsk_buff * skb, __be32 mpls_lse)¶ modify outermost MPLS header and update csum
Parameters
structsk_buff*skb- buffer
__be32mpls_lse- new MPLS label stack entry to update to
Description
Expects skb->data at mac header.
Returns 0 on success, -errno otherwise.
Parameters
structsk_buff*skb- buffer
Description
Expects skb->data at mac header.
Returns 0 on success, -errno otherwise.
- structsk_buff *
alloc_skb_with_frags(unsigned long header_len, unsigned long data_len, int max_page_order, int * errcode, gfp_t gfp_mask)¶ allocate skb with page frags
Parameters
unsignedlongheader_len- size of linear part
unsignedlongdata_len- needed length in frags
intmax_page_order- max page order desired.
int*errcode- pointer to error code if any
gfp_tgfp_mask- allocation mask
Description
This can be used to allocate a paged skb, given a maximal order for frags.
- void *
skb_ext_add(structsk_buff * skb, enum skb_ext_id id)¶ allocate space for given extension, COW if needed
Parameters
structsk_buff*skb- buffer
enumskb_ext_idid- extension to allocate space for
Description
Allocates enough space for the given extension. If the extension is already present, a pointer to that extension is returned.
If the skb was cloned, COW applies and the returned memory can be modified without changing the extension space of cloned buffers.
Returns pointer to the extension or NULL on allocation failure.
- bool
sk_ns_capable(const structsock * sk, struct user_namespace * user_ns, int cap)¶ General socket capability test
Parameters
conststructsock*sk- Socket to use a capability on or through
structuser_namespace*user_ns- The user namespace of the capability to use
intcap- The capability to use
Description
Test to see if the opener of the socket had the capability cap when the socket was created and the current process has the capability cap in the user namespace user_ns.
Parameters
conststructsock*sk- Socket to use a capability on or through
intcap- The global capability to use
Description
Test to see if the opener of the socket had the capability cap when the socket was created and the current process has the capability cap in all user namespaces.
Parameters
conststructsock*sk- Socket to use a capability on or through
intcap- The capability to use
Description
Test to see if the opener of the socket had the capability cap when the socket was created and the current process has the capability cap over the network namespace the socket is a member of.
Parameters
structsock*sk- socket to set it on
Description
Set SOCK_MEMALLOC on a socket for access to emergency reserves. It's the responsibility of the admin to adjust min_free_kbytes to meet the requirements.
- structsock *
sk_alloc(struct net * net, int family, gfp_t priority, struct proto * prot, int kern)¶ All socket objects are allocated here
Parameters
structnet*net- the applicable net namespace
intfamily- protocol family
gfp_tpriority- for allocation (GFP_KERNEL, GFP_ATOMIC, etc)
structproto*prot- struct proto associated with this new sock instance
intkern- is this to be a kernel socket?
- structsock *
sk_clone_lock(const structsock * sk, const gfp_t priority)¶ clone a socket, and lock its clone
Parameters
conststructsock*sk- the socket to clone
constgfp_tpriority- for allocation (GFP_KERNEL, GFP_ATOMIC, etc)
Caller must unlock socket even in error path (bh_unlock_sock(newsk))
- bool
skb_page_frag_refill(unsigned int sz, struct page_frag * pfrag, gfp_t gfp)¶ check that a page_frag contains enough room
Parameters
unsignedintsz- minimum size of the fragment we want to get
structpage_frag*pfrag- pointer to page_frag
gfp_tgfp- priority for memory allocation
Note
While this allocator tries to use high order pages, there is no guarantee that allocations succeed. Therefore, sz MUST be less than or equal to PAGE_SIZE.
- int
sk_wait_data(structsock * sk, long * timeo, const structsk_buff * skb)¶ wait for data to arrive at sk_receive_queue
Parameters
structsock*sk- sock to wait on
long*timeo- for how long
conststructsk_buff*skb- last skb seen on sk_receive_queue
Description
Now socket state including sk->sk_err is changed only under lock, hence we may omit checks after joining the wait queue. We check the receive queue before schedule() only as an optimization; it is very likely that release_sock() added new data.
Parameters
structsock*sk- socket
intsize- memory size to allocate
intamt- pages to allocate
intkind- allocation type
Similar to
__sk_mem_schedule(), but does not update sk_forward_alloc
- int
__sk_mem_schedule(structsock * sk, int size, int kind)¶ increase sk_forward_alloc and memory_allocated
Parameters
structsock*sk- socket
intsize- memory size to allocate
intkind- allocation type
If kind is SK_MEM_SEND, it means wmem allocation. Otherwise it means rmem allocation. This function assumes that protocols which have memory_pressure use sk_wmem_queued as write buffer accounting.
Parameters
structsock*sk- socket
intamount- number of quanta
Similar to
__sk_mem_reclaim(), but does not update sk_forward_alloc
Parameters
structsock*sk- socket
intamount- number of bytes (rounded down to a SK_MEM_QUANTUM multiple)
Parameters
structsock*sk- socket
Description
This version should be used for very small sections, where the process won't block.
Returns false if the fast path is taken:
sk_lock.slock locked, owned = 0, BH disabled
Returns true if the slow path is taken:
sk_lock.slock unlocked, owned = 1, BH enabled
- structsk_buff *
__skb_try_recv_datagram(structsock * sk, struct sk_buff_head * queue, unsigned int flags, int * off, int * err, structsk_buff ** last)¶ Receive a datagram skbuff
Parameters
structsock*sk- socket
structsk_buff_head*queue- socket queue from which to receive
unsignedintflags- MSG_ flags
int*off- an offset in bytes to peek skb from. Returns an offsetwithin an skb where data actually starts
int*err- error code returned
structsk_buff**last- set to last peeked message to inform the wait function what to look for when peeking
Get a datagram skbuff, understands the peeking, nonblocking wakeups and possible races. This replaces identical code in packet, raw and udp, as well as IPX, AX.25 and Appletalk. It also finally fixes the long standing peek and read race for datagram sockets. If you alter this routine remember it must be re-entrant.
This function will lock the socket if a skb is returned, so the caller needs to unlock the socket in that case (usually by calling skb_free_datagram). Returns NULL with err set to -EAGAIN if no data was available or to some other value if an error was detected.
It does not lock socket since today. This function is free of race conditions. This measure should/can improve significantly datagram socket latencies at high loads, when data copying to user space takes lots of time. (BTW I've just killed the last cli() in IP/IPv6/core/netlink/packet. Great win.) –ANK (980729)
The order of the tests when we find no data waiting is specified quite explicitly by POSIX 1003.1g; don't change them without having the standard around, please.
- int
skb_kill_datagram(structsock * sk, structsk_buff * skb, unsigned int flags)¶ Free a datagram skbuff forcibly
Parameters
structsock*sk- socket
structsk_buff*skb- datagram skbuff
unsignedintflags- MSG_ flags
This function frees a datagram skbuff that was received by skb_recv_datagram. The flags argument must match the one used for skb_recv_datagram.
If the MSG_PEEK flag is set, and the packet is still on the receive queue of the socket, it will be taken off the queue before it is freed.
This function currently only disables BH when acquiring the sk_receive_queue lock. Therefore it must not be used in a context where that lock is acquired in an IRQ context.
It returns 0 if the packet was removed by us.
- int
skb_copy_and_hash_datagram_iter(const structsk_buff * skb, int offset, struct iov_iter * to, int len, struct ahash_request * hash)¶ Copy datagram to an iovec iterator and update a hash.
Parameters
conststructsk_buff*skb- buffer to copy
intoffset- offset in the buffer to start copying from
structiov_iter*to- iovec iterator to copy to
intlen- amount of data to copy from buffer to iovec
structahash_request*hash- hash request to update
- int
skb_copy_datagram_iter(const structsk_buff * skb, int offset, struct iov_iter * to, int len)¶ Copy a datagram to an iovec iterator.
Parameters
conststructsk_buff*skb- buffer to copy
intoffset- offset in the buffer to start copying from
structiov_iter*to- iovec iterator to copy to
intlen- amount of data to copy from buffer to iovec
- int
skb_copy_datagram_from_iter(structsk_buff * skb, int offset, struct iov_iter * from, int len)¶ Copy a datagram from an iov_iter.
Parameters
structsk_buff*skb- buffer to copy
intoffset- offset in the buffer to start copying to
structiov_iter*from- the copy source
intlen- amount of data to copy to buffer from iovec
Returns 0 or -EFAULT.
- int
zerocopy_sg_from_iter(structsk_buff * skb, struct iov_iter * from)¶ Build a zerocopy datagram from an iov_iter
Parameters
structsk_buff*skb- buffer to copy
structiov_iter*from- the source to copy from
The function will first copy up to headlen, and then pin the userspace pages and build frags through them.
Returns 0, -EFAULT or -EMSGSIZE.
- int
skb_copy_and_csum_datagram_msg(structsk_buff * skb, int hlen, struct msghdr * msg)¶ Copy and checksum skb to user iovec.
Parameters
structsk_buff*skb- skbuff
inthlen- hardware length
structmsghdr*msg- destination
Caller _must_ check that skb will fit to this iovec.
Return
- 0 - success.
- -EINVAL - checksum failure.
- -EFAULT - fault during copy.
- __poll_t
datagram_poll(struct file * file, structsocket * sock, poll_table * wait)¶ generic datagram poll
Parameters
structfile*file- file struct
structsocket*sock- socket
poll_table*wait- poll table
Datagram poll: Again totally generic. This also handles sequenced packet sockets providing the socket receive queue is only ever holding data ready to receive.
Note
- When you don't use this routine for this protocol, and you use a different write policy from sock_writeable(), then please supply your own write_space callback.
- int
sk_stream_wait_connect(structsock * sk, long * timeo_p)¶ Wait for a socket to get into the connected state
Parameters
structsock*sk- sock to wait on
long*timeo_p- for how long to wait
Description
Must be called with the socket locked.
Parameters
structsock*sk- socket to wait for memory
long*timeo_p- for how long
Socket Filter¶
- int
sk_filter_trim_cap(structsock * sk, structsk_buff * skb, unsigned int cap)¶ run a packet through a socket filter
Parameters
structsock*sk- sock associated with sk_buff
structsk_buff*skb- buffer to filter
unsignedintcap- limit on how short the eBPF program may trim the packet
Description
Run the eBPF program and then cut skb->data to the correct size returned by the program. If pkt_len is 0 we toss the packet. If skb->len is smaller than pkt_len we keep the whole skb->data. This is the socket level wrapper to BPF_PROG_RUN. It returns 0 if the packet should be accepted or -EPERM if the packet should be tossed.
- int
bpf_prog_create(struct bpf_prog ** pfp, struct sock_fprog_kern * fprog)¶ create an unattached filter
Parameters
structbpf_prog**pfp- the unattached filter that is created
structsock_fprog_kern*fprog- the filter program
Description
Create a filter independent of any socket. We first run some sanity checks on it to make sure it does not explode on us later. If an error occurs or there is insufficient memory for the filter a negative errno code is returned. On success the return is zero.
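A minimal usage sketch: a trivial "accept everything" classic BPF program wrapped into a struct sock_fprog_kern and turned into an unattached filter (release it with bpf_prog_destroy() when done):

#include <linux/filter.h>
#include <linux/kernel.h>

static struct sock_filter example_insns[] = {
	BPF_STMT(BPF_RET | BPF_K, 0xffffffff),	/* return "accept all" */
};

static int example_build_filter(struct bpf_prog **prog)
{
	struct sock_fprog_kern fprog = {
		.len	= ARRAY_SIZE(example_insns),
		.filter	= example_insns,
	};

	return bpf_prog_create(prog, &fprog);	/* 0 on success */
}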
- int
bpf_prog_create_from_user(struct bpf_prog ** pfp, struct sock_fprog * fprog, bpf_aux_classic_check_t trans, bool save_orig)¶ create an unattached filter from user buffer
Parameters
structbpf_prog**pfp- the unattached filter that is created
structsock_fprog*fprog- the filter program
bpf_aux_classic_check_ttrans- post-classic verifier transformation handler
boolsave_orig- save classic BPF program
Description
This function effectively does the same as bpf_prog_create(), only that it builds up its insns buffer from a user space provided buffer. It also allows for passing a bpf_aux_classic_check_t handler.
Parameters
structsock_fprog*fprog- the filter program
structsock*sk- the socket to use
Description
Attach the user's filter code. We first run some sanity checks on it to make sure it does not explode on us later. If an error occurs or there is insufficient memory for the filter a negative errno code is returned. On success the return is zero.
Generic Network Statistics¶
- struct
gnet_stats_basic¶ byte/packet throughput statistics
Definition
struct gnet_stats_basic { __u64 bytes; __u32 packets;};Members
bytes- number of seen bytes
packets- number of seen packets
- struct
gnet_stats_rate_est¶ rate estimator
Definition
struct gnet_stats_rate_est { __u32 bps; __u32 pps;};Members
bps- current byte rate
pps- current packet rate
- struct
gnet_stats_rate_est64¶ rate estimator
Definition
struct gnet_stats_rate_est64 { __u64 bps; __u64 pps;};Members
bps- current byte rate
pps- current packet rate
- struct
gnet_stats_queue¶ queuing statistics
Definition
struct gnet_stats_queue { __u32 qlen; __u32 backlog; __u32 drops; __u32 requeues; __u32 overlimits;};Members
qlen- queue length
backlog- backlog size of queue
drops- number of dropped packets
requeues- number of requeues
overlimits- number of enqueues over the limit
- struct
gnet_estimator¶ rate estimator configuration
Definition
struct gnet_estimator { signed char interval; unsigned char ewma_log;};Members
interval- sampling period
ewma_log- the log of measurement window weight
- int
gnet_stats_start_copy_compat(structsk_buff * skb, int type, int tc_stats_type, int xstats_type, spinlock_t * lock, struct gnet_dump * d, int padattr)¶ start dumping procedure in compatibility mode
Parameters
structsk_buff*skb- socket buffer to put statistics TLVs into
inttype- TLV type for top level statistic TLV
inttc_stats_type- TLV type for backward compatibility struct tc_stats TLV
intxstats_type- TLV type for backward compatibility xstats TLV
spinlock_t*lock- statistics lock
structgnet_dump*d- dumping handle
intpadattr- padding attribute
Description
Initializes the dumping handle, grabs the statistic lock and appends an empty TLV header to the socket buffer for use as a container for all other statistic TLVs.
The dumping handle is marked to be in backward compatibility mode tellingall gnet_stats_copy_XXX() functions to fill a local copy of struct tc_stats.
Returns 0 on success or -1 if the room in the socket buffer was not sufficient.
- int
gnet_stats_start_copy(structsk_buff * skb, int type, spinlock_t * lock, struct gnet_dump * d, int padattr)¶ start dumping procedure in compatibility mode
Parameters
structsk_buff*skb- socket buffer to put statistics TLVs into
inttype- TLV type for top level statistic TLV
spinlock_t*lock- statistics lock
structgnet_dump*d- dumping handle
intpadattr- padding attribute
Description
Initializes the dumping handle, grabs the statistic lock and appends an empty TLV header to the socket buffer for use as a container for all other statistic TLVs.
Returns 0 on success or -1 if the room in the socket buffer was not sufficient.
- int
gnet_stats_copy_basic(const seqcount_t * running, struct gnet_dump * d, struct gnet_stats_basic_cpu __percpu * cpu, struct gnet_stats_basic_packed * b)¶ copy basic statistics into statistic TLV
Parameters
constseqcount_t*running- seqcount_t pointer
structgnet_dump*d- dumping handle
structgnet_stats_basic_cpu__percpu*cpu- copy statistic per cpu
structgnet_stats_basic_packed*b- basic statistics
Description
Appends the basic statistics to the top level TLV created bygnet_stats_start_copy().
Returns 0 on success or -1 with the statistic lock releasedif the room in the socket buffer was not sufficient.
- int
gnet_stats_copy_basic_hw(const seqcount_t * running, struct gnet_dump * d, struct gnet_stats_basic_cpu __percpu * cpu, struct gnet_stats_basic_packed * b)¶ copy basic hw statistics into statistic TLV
Parameters
constseqcount_t*running- seqcount_t pointer
structgnet_dump*d- dumping handle
structgnet_stats_basic_cpu__percpu*cpu- copy statistic per cpu
structgnet_stats_basic_packed*b- basic statistics
Description
Appends the basic statistics to the top level TLV created bygnet_stats_start_copy().
Returns 0 on success or -1 with the statistic lock releasedif the room in the socket buffer was not sufficient.
- int
gnet_stats_copy_rate_est(struct gnet_dump * d, struct net_rate_estimator __rcu ** rate_est)¶ copy rate estimator statistics into statistics TLV
Parameters
structgnet_dump*d- dumping handle
structnet_rate_estimator__rcu**rate_est- rate estimator
Description
Appends the rate estimator statistics to the top level TLV created bygnet_stats_start_copy().
Returns 0 on success or -1 with the statistic lock releasedif the room in the socket buffer was not sufficient.
- int
gnet_stats_copy_queue(struct gnet_dump * d, structgnet_stats_queue __percpu * cpu_q, structgnet_stats_queue * q, __u32 qlen)¶ copy queue statistics into statistics TLV
Parameters
structgnet_dump*d- dumping handle
structgnet_stats_queue__percpu*cpu_q- per cpu queue statistics
structgnet_stats_queue*q- queue statistics
__u32qlen- queue length statistics
Description
Appends the queue statistics to the top level TLV created by gnet_stats_start_copy(), using per cpu queue statistics if they are available.
Returns 0 on success or -1 with the statistic lock releasedif the room in the socket buffer was not sufficient.
- int
gnet_stats_copy_app(struct gnet_dump * d, void * st, int len)¶ copy application specific statistics into statistics TLV
Parameters
structgnet_dump*d- dumping handle
void*st- application specific statistics data
intlen- length of data
Description
Appends the application specific statistics to the top level TLV created by gnet_stats_start_copy() and remembers the data for XSTATS if the dumping handle is in backward compatibility mode.
Returns 0 on success or -1 with the statistic lock releasedif the room in the socket buffer was not sufficient.
- int
gnet_stats_finish_copy(struct gnet_dump * d)¶ finish dumping procedure
Parameters
structgnet_dump*d- dumping handle
Description
Corrects the length of the top level TLV to include all TLVs added by gnet_stats_copy_XXX() calls. Adds the backward compatibility TLVs if gnet_stats_start_copy_compat() was used and releases the statistics lock.
Returns 0 on success or -1 with the statistic lock releasedif the room in the socket buffer was not sufficient.
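A minimal sketch of the dump sequence described above, as it might appear in a qdisc dump callback; the TCA_STATS2/TCA_PAD attribute types and the NULL seqcount/per-cpu arguments are illustrative assumptions:

#include <linux/rtnetlink.h>
#include <net/gen_stats.h>

static int example_dump_stats(struct sk_buff *skb, spinlock_t *lock,
			      struct gnet_stats_basic_packed *bstats,
			      struct gnet_stats_queue *qstats, __u32 qlen)
{
	struct gnet_dump d;

	if (gnet_stats_start_copy(skb, TCA_STATS2, lock, &d, TCA_PAD) < 0)
		return -1;

	if (gnet_stats_copy_basic(NULL, &d, NULL, bstats) < 0 ||
	    gnet_stats_copy_queue(&d, NULL, qstats, qlen) < 0)
		return -1;

	return gnet_stats_finish_copy(&d);
}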
- int
gen_new_estimator(struct gnet_stats_basic_packed * bstats, struct gnet_stats_basic_cpu __percpu * cpu_bstats, struct net_rate_estimator __rcu ** rate_est, spinlock_t * lock, seqcount_t * running, struct nlattr * opt)¶ create a new rate estimator
Parameters
structgnet_stats_basic_packed*bstats- basic statistics
structgnet_stats_basic_cpu__percpu*cpu_bstats- bstats per cpu
structnet_rate_estimator__rcu**rate_est- rate estimator statistics
spinlock_t*lock- lock for statistics and control path
seqcount_t*running- qdisc running seqcount
structnlattr*opt- rate estimator configuration TLV
Description
Creates a new rate estimator with bstats as source and rate_est as destination. A new timer with the interval specified in the configuration TLV is created. Upon each interval, the latest statistics will be read from bstats and the estimated rate will be stored in rate_est with the statistics lock grabbed during this period.
Returns 0 on success or a negative error code.
- void
gen_kill_estimator(struct net_rate_estimator __rcu ** rate_est)¶ remove a rate estimator
Parameters
structnet_rate_estimator__rcu**rate_est- rate estimator
Description
Removes the rate estimator.
- int
gen_replace_estimator(struct gnet_stats_basic_packed * bstats, struct gnet_stats_basic_cpu __percpu * cpu_bstats, struct net_rate_estimator __rcu ** rate_est, spinlock_t * lock, seqcount_t * running, struct nlattr * opt)¶ replace rate estimator configuration
Parameters
structgnet_stats_basic_packed*bstats- basic statistics
structgnet_stats_basic_cpu__percpu*cpu_bstats- bstats per cpu
structnet_rate_estimator__rcu**rate_est- rate estimator statistics
spinlock_t*lock- lock for statistics and control path
seqcount_t*running- qdisc running seqcount (might be NULL)
structnlattr*opt- rate estimator configuration TLV
Description
Replaces the configuration of a rate estimator by callinggen_kill_estimator() andgen_new_estimator().
Returns 0 on success or a negative error code.
- bool
gen_estimator_active(struct net_rate_estimator __rcu ** rate_est)¶ test if estimator is currently in use
Parameters
structnet_rate_estimator__rcu**rate_est- rate estimator
Description
Returns true if estimator is active, and false if not.
SUN RPC subsystem¶
- __be32 *
xdr_encode_opaque_fixed(__be32 * p, const void * ptr, unsigned int nbytes)¶ Encode fixed length opaque data
Parameters
__be32*p- pointer to current position in XDR buffer.
constvoid*ptr- pointer to data to encode (or NULL)
unsignedintnbytes- size of data.
Description
Copy the array of data of length nbytes at ptr to the XDR bufferat position p, then align to the next 32-bit boundary by paddingwith zero bytes (see RFC1832).Returns the updated current XDR buffer position
Note
if ptr is NULL, only the padding is performed.
- __be32 *
xdr_encode_opaque(__be32 * p, const void * ptr, unsigned int nbytes)¶ Encode variable length opaque data
Parameters
__be32*p- pointer to current position in XDR buffer.
constvoid*ptr- pointer to data to encode (or NULL)
unsignedintnbytes- size of data.
Description
Returns the updated current XDR buffer position
- void
xdr_terminate_string(struct xdr_buf * buf, const u32 len)¶ '\0'-terminate a string residing in an xdr_buf
Parameters
structxdr_buf*buf- XDR buffer where string resides
constu32len- length of string, in bytes
- void
xdr_inline_pages(struct xdr_buf * xdr, unsigned int offset, struct page ** pages, unsigned int base, unsigned int len)¶ Prepare receive buffer for a large reply
Parameters
structxdr_buf*xdr- xdr_buf into which reply will be placed
unsignedintoffset- expected offset where data payload will start, in bytes
structpage**pages- vector of struct page pointers
unsignedintbase- offset in first page where receive should start, in bytes
unsignedintlen- expected size of the upper layer data payload, in bytes
- void
_copy_from_pages(char * p, struct page ** pages, size_t pgbase, size_t len)¶
Parameters
char*p- pointer to destination
structpage**pages- array of pages
size_tpgbase- offset of source data
size_tlen- length
Description
Copies data into an arbitrary memory location from an array of pages. The copy is assumed to be non-overlapping.
- unsigned int
xdr_stream_pos(const struct xdr_stream * xdr)¶ Return the current offset from the start of the xdr_stream
Parameters
conststructxdr_stream*xdr- pointer to struct xdr_stream
- void
xdr_init_encode(struct xdr_stream * xdr, struct xdr_buf * buf, __be32 * p, struct rpc_rqst * rqst)¶ Initialize a struct xdr_stream for sending data.
Parameters
structxdr_stream*xdr- pointer to xdr_stream struct
structxdr_buf*buf- pointer to XDR buffer in which to encode data
__be32*p- current pointer inside XDR buffer
structrpc_rqst*rqst- pointer to controlling rpc_rqst, for debugging
Note
- at the moment the RPC client only passes the length of our scratch buffer in the xdr_buf's header kvec. Previously this meant we needed to call xdr_adjust_iovec() after encoding the data. With the new scheme, the xdr_stream manages the details of the buffer length, and takes care of adjusting the kvec length for us.
- void
xdr_commit_encode(struct xdr_stream * xdr)¶ Ensure all data is written to buffer
Parameters
structxdr_stream*xdr- pointer to xdr_stream
Description
We handle encoding across page boundaries by giving the caller atemporary location to write to, then later copying the data intoplace; xdr_commit_encode does that copying.
Normally the caller doesn’t need to call this directly, as thefollowing xdr_reserve_space will do it. But an explicit call may berequired at the end of encoding, or any other time when the xdr_bufdata might be read.
- __be32 *
xdr_reserve_space(struct xdr_stream * xdr, size_t nbytes)¶ Reserve buffer space for sending
Parameters
structxdr_stream*xdr- pointer to xdr_stream
size_tnbytes- number of bytes to reserve
Description
Checks that we have enough buffer space to encode 'nbytes' more bytes of data. If so, update the total xdr_buf length, and adjust the length of the current kvec.
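A minimal sketch of the encode side: reserve room for two 32-bit words in a stream prepared with xdr_init_encode() and fill them in (the values are illustrative):

#include <linux/sunrpc/xdr.h>

static int example_encode(struct xdr_stream *xdr)
{
	__be32 *p;

	p = xdr_reserve_space(xdr, 2 * sizeof(__be32));
	if (!p)
		return -EMSGSIZE;	/* not enough buffer space */

	*p++ = cpu_to_be32(42);		/* first word */
	*p = cpu_to_be32(7);		/* second word */
	return 0;
}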
- void
xdr_truncate_encode(struct xdr_stream * xdr, size_t len)¶ truncate an encode buffer
Parameters
structxdr_stream*xdr- pointer to xdr_stream
size_tlen- new length of buffer
Description
Truncates the xdr stream, so that xdr->buf->len == len, and xdr->p points at offset len from the start of the buffer, and head, tail, and page lengths are adjusted to correspond.
If this means moving xdr->p to a different buffer, we assume that the end pointer should be set to the end of the current page, except in the case of the head buffer when we assume the head buffer's current length represents the end of the available buffer.
This is not safe to use on a buffer that already has inlined page cache pages (as in a zero-copy server read reply), except for the simple case of truncating from one position in the tail to another.
- int
xdr_restrict_buflen(struct xdr_stream * xdr, int newbuflen)¶ decrease available buffer space
Parameters
structxdr_stream*xdr- pointer to xdr_stream
intnewbuflen- new maximum number of bytes available
Description
Adjust our idea of how much space is available in the buffer.If we’ve already used too much space in the buffer, returns -1.If the available space is already smaller than newbuflen, returns 0and does nothing. Otherwise, adjusts xdr->buf->buflen to newbuflenand ensures xdr->end is set at most offset newbuflen from the startof the buffer.
- void
xdr_write_pages(struct xdr_stream * xdr, struct page ** pages, unsigned int base, unsigned int len)¶ Insert a list of pages into an XDR buffer for sending
Parameters
structxdr_stream*xdr- pointer to xdr_stream
structpage**pages- list of pages
unsignedintbase- offset of first byte
unsignedintlen- length of data in bytes
- void
xdr_init_decode(struct xdr_stream * xdr, struct xdr_buf * buf, __be32 * p, struct rpc_rqst * rqst)¶ Initialize an xdr_stream for decoding data.
Parameters
structxdr_stream*xdr- pointer to xdr_stream struct
structxdr_buf*buf- pointer to XDR buffer from which to decode data
__be32*p- current pointer inside XDR buffer
structrpc_rqst*rqst- pointer to controlling rpc_rqst, for debugging
- void
xdr_init_decode_pages(struct xdr_stream * xdr, struct xdr_buf * buf, struct page ** pages, unsigned int len)¶ Initialize an xdr_stream for decoding into pages
Parameters
structxdr_stream*xdr- pointer to xdr_stream struct
structxdr_buf*buf- pointer to XDR buffer from which to decode data
structpage**pages- list of pages to decode into
unsignedintlen- length in bytes of buffer in pages
- void
xdr_set_scratch_buffer(struct xdr_stream * xdr, void * buf, size_t buflen)¶ Attach a scratch buffer for decoding data.
Parameters
structxdr_stream*xdr- pointer to xdr_stream struct
void*buf- pointer to an empty buffer
size_tbuflen- size of ‘buf’
Description
The scratch buffer is used when decoding from an array of pages.If anxdr_inline_decode() call spans across page boundaries, thenwe copy the data into the scratch buffer in order to allow linearaccess.
- __be32 *
xdr_inline_decode(struct xdr_stream * xdr, size_t nbytes)¶ Retrieve XDR data to decode
Parameters
structxdr_stream*xdr- pointer to xdr_stream struct
size_tnbytes- number of bytes of data to decode
Description
Check if the input buffer is long enough to enable us to decode 'nbytes' more bytes of data starting at the current position. If so return the current pointer, then update the current pointer position.
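A minimal sketch of the decode side: pull two 32-bit words out of a stream prepared with xdr_init_decode():

#include <linux/sunrpc/xdr.h>

static int example_decode(struct xdr_stream *xdr, u32 *a, u32 *b)
{
	__be32 *p;

	p = xdr_inline_decode(xdr, 2 * sizeof(__be32));
	if (!p)
		return -EBADMSG;	/* input buffer too short */

	*a = be32_to_cpup(p++);
	*b = be32_to_cpup(p);
	return 0;
}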
- unsigned int
xdr_read_pages(struct xdr_stream * xdr, unsigned int len)¶ Ensure page-based XDR data to decode is aligned at current pointer position
Parameters
structxdr_stream*xdr- pointer to xdr_stream struct
unsignedintlen- number of bytes of page data
Description
Moves data beyond the current pointer position from the XDR head[] bufferinto the page list. Any data that lies beyond current position + “len”bytes is moved into the XDR tail[].
Returns the number of XDR encoded bytes now contained in the pages
- void
xdr_enter_page(struct xdr_stream * xdr, unsigned int len)¶ decode data from the XDR page
Parameters
structxdr_stream*xdr- pointer to xdr_stream struct
unsignedintlen- number of bytes of page data
Description
Moves data beyond the current pointer position from the XDR head[] bufferinto the page list. Any data that lies beyond current position + “len”bytes is moved into the XDR tail[]. The current pointer is thenrepositioned at the beginning of the first XDR page.
- int
xdr_buf_subsegment(struct xdr_buf * buf, struct xdr_buf * subbuf, unsigned int base, unsigned int len)¶ set subbuf to a portion of buf
Parameters
structxdr_buf*buf- an xdr buffer
structxdr_buf*subbuf- the result buffer
unsignedintbase- beginning of range in bytes
unsignedintlen- length of range in bytes
Description
sets subbuf to an xdr buffer representing the portion of buf of length len starting at offset base.
buf and subbuf may be pointers to the same struct xdr_buf.
Returns -1 if base or length are out of bounds.
- void
xdr_buf_trim(struct xdr_buf * buf, unsigned int len)¶ lop at most “len” bytes off the end of “buf”
Parameters
structxdr_buf*buf- buf to be trimmed
unsignedintlen- number of bytes to reduce “buf” by
Description
Trim an xdr_buf by the given number of bytes by fixing up the lengths. Notethat it’s possible that we’ll trim less than that amount if the xdr_buf istoo small, or if (for instance) it’s all in the head and the parser hasalready read too far into it.
- ssize_t
xdr_stream_decode_opaque(struct xdr_stream * xdr, void * ptr, size_t size)¶ Decode variable length opaque
Parameters
structxdr_stream*xdr- pointer to xdr_stream
void*ptr- location to store opaque data
size_tsize- size of storage bufferptr
Description
- Return values:
- On success, returns size of object stored in *ptr
- -EBADMSG on XDR buffer overflow
- -EMSGSIZE on overflow of storage buffer ptr
- ssize_t
xdr_stream_decode_opaque_dup(struct xdr_stream * xdr, void ** ptr, size_t maxlen, gfp_t gfp_flags)¶ Decode and duplicate variable length opaque
Parameters
structxdr_stream*xdr- pointer to xdr_stream
void**ptr- location to store pointer to opaque data
size_tmaxlen- maximum acceptable object size
gfp_tgfp_flags- GFP mask to use
Description
- Return values:
- On success, returns size of object stored in *ptr
- -EBADMSG on XDR buffer overflow
- -EMSGSIZE if the size of the object would exceed maxlen
- -ENOMEM on memory allocation failure
- ssize_t
xdr_stream_decode_string(struct xdr_stream * xdr, char * str, size_t size)¶ Decode variable length string
Parameters
structxdr_stream*xdr- pointer to xdr_stream
char*str- location to store string
size_tsize- size of storage bufferstr
Description
- Return values:
- On success, returns length of NUL-terminated string stored in *str
- -EBADMSG on XDR buffer overflow
- -EMSGSIZE on overflow of storage buffer *str
- ssize_t
xdr_stream_decode_string_dup(struct xdr_stream * xdr, char ** str, size_t maxlen, gfp_t gfp_flags)¶ Decode and duplicate variable length string
Parameters
structxdr_stream*xdr- pointer to xdr_stream
char**str- location to store pointer to string
size_tmaxlen- maximum acceptable string length
gfp_tgfp_flags- GFP mask to use
Description
- Return values:
- On success, returns length of NUL-terminated string stored in *str
- -EBADMSG on XDR buffer overflow
- -EMSGSIZE if the size of the string would exceed maxlen
- -ENOMEM on memory allocation failure
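A hedged usage sketch (the EXAMPLE_NAME_MAXLEN bound and the helper name are hypothetical):

#include <linux/slab.h>
#include <linux/sunrpc/xdr.h>

#define EXAMPLE_NAME_MAXLEN 256		/* illustrative upper bound */

/* Decode a variable-length string into a freshly allocated buffer. */
static ssize_t example_decode_name(struct xdr_stream *xdr, char **namep)
{
	return xdr_stream_decode_string_dup(xdr, namep, EXAMPLE_NAME_MAXLEN,
					    GFP_KERNEL);
}

On success the caller owns *namep and is expected to release it with kfree().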
- char *
svc_print_addr(struct svc_rqst * rqstp, char * buf, size_t len)¶ Format rq_addr field for printing
Parameters
structsvc_rqst*rqstp- svc_rqst struct containing address to print
char*buf- target buffer for formatted address
size_tlen- length of target buffer
- void
svc_reserve(struct svc_rqst * rqstp, int space)¶ change the space reserved for the reply to a request.
Parameters
structsvc_rqst*rqstp- The request in question
intspace- new max space to reserve
Description
Each request reserves some space on the output queue of the transport to make sure the reply fits. This function reduces that reserved space to be the amount of space used already, plus space.
- struct svc_xprt *
svc_find_xprt(struct svc_serv * serv, const char * xcl_name, struct net * net, const sa_family_t af, const unsigned short port)¶ find an RPC transport instance
Parameters
structsvc_serv*serv- pointer to svc_serv to search
constchar*xcl_name- C string containing transport’s class name
structnet*net- owner net pointer
constsa_family_taf- Address family of transport’s local address
constunsignedshortport- transport’s IP port number
Description
Return the transport instance pointer for the endpoint accepting connections/peer traffic from the specified transport class, address family and port.
Specifying 0 for the address family or port is effectively a wild-card, and will result in matching the first transport in the service's list that has a matching class name.
- int
svc_xprt_names(struct svc_serv * serv, char * buf, const int buflen)¶ format a buffer with a list of transport names
Parameters
structsvc_serv*serv- pointer to an RPC service
char*buf- pointer to a buffer to be filled in
constintbuflen- length of buffer to be filled in
Description
Fills in buf with a string containing a list of transport names, each name terminated with '\n'.
Returns positive length of the filled-in string on success; otherwise a negative errno value is returned if an error occurs.
- int
xprt_register_transport(struct xprt_class * transport)¶ register a transport implementation
Parameters
structxprt_class*transport- transport to register
Description
If a transport implementation is loaded as a kernel module, it can call this interface to make itself known to the RPC client.
Return
0: transport successfully registered
-EEXIST: transport already registered
-EINVAL: transport module being unloaded
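A hedged registration sketch, paired with xprt_unregister_transport() documented below; the field set of struct xprt_class shown here is abbreviated, and the name, ident value and setup callback are purely illustrative:

#include <linux/module.h>
#include <linux/sunrpc/xprt.h>

/* Hypothetical setup callback; a real transport builds an rpc_xprt here. */
static struct rpc_xprt *example_xprt_setup(struct xprt_create *args);

static struct xprt_class example_transport = {
	.list	= LIST_HEAD_INIT(example_transport.list),
	.name	= "example",
	.owner	= THIS_MODULE,
	.ident	= 255,			/* illustrative transport identifier */
	.setup	= example_xprt_setup,
};

static int __init example_transport_init(void)
{
	return xprt_register_transport(&example_transport);
}

static void __exit example_transport_exit(void)
{
	xprt_unregister_transport(&example_transport);
}

module_init(example_transport_init);
module_exit(example_transport_exit);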
- int
xprt_unregister_transport(struct xprt_class * transport)¶ unregister a transport implementation
Parameters
structxprt_class*transport- transport to unregister
Return
0: transport successfully unregistered
-ENOENT: transport never registered
- int
xprt_load_transport(const char * transport_name)¶ load a transport implementation
Parameters
constchar*transport_name- transport to load
Return
0: transport successfully loaded
-ENOENT: transport module not available
- int
xprt_reserve_xprt(struct rpc_xprt * xprt, struct rpc_task * task)¶ serialize write access to transports
Parameters
structrpc_xprt*xprt- pointer to the target transport
structrpc_task*task- task that is requesting access to the transport
Description
This prevents mixing the payload of separate requests, and prevents transport connects from colliding with writes. No congestion control is provided.
- void
xprt_release_xprt(struct rpc_xprt * xprt, struct rpc_task * task)¶ allow other requests to use a transport
Parameters
structrpc_xprt*xprt- transport with other tasks potentially waiting
structrpc_task*task- task that is releasing access to the transport
Description
Note that “task” can be NULL. No congestion control is provided.
- void
xprt_release_xprt_cong(struct rpc_xprt * xprt, struct rpc_task * task)¶ allow other requests to use a transport
Parameters
structrpc_xprt*xprt- transport with other tasks potentially waiting
structrpc_task*task- task that is releasing access to the transport
Description
Note that “task” can be NULL. Another task is awoken to use thetransport if the transport’s congestion window allows it.
- bool
xprt_request_get_cong(struct rpc_xprt * xprt, struct rpc_rqst * req)¶ Request congestion control credits
Parameters
structrpc_xprt*xprt- pointer to transport
structrpc_rqst*req- pointer to RPC request
Description
Useful for transports that require congestion control.
- void
xprt_release_rqst_cong(struct rpc_task * task)¶ housekeeping when request is complete
Parameters
structrpc_task*task- RPC request that recently completed
Description
Useful for transports that require congestion control.
- void
xprt_adjust_cwnd(struct rpc_xprt * xprt, struct rpc_task * task, int result)¶ adjust transport congestion window
Parameters
structrpc_xprt*xprt- pointer to xprt
structrpc_task*task- recently completed RPC request used to adjust window
intresult- result code of completed RPC request
Description
The transport code maintains an estimate on the maximum number of outstanding RPC requests, using a smoothed version of the congestion avoidance implemented in 44BSD. This is basically the Van Jacobson congestion algorithm: if a retransmit occurs, the congestion window is halved; otherwise, it is incremented by 1/cwnd when
- a reply is received and
- a full number of requests are outstanding and
- the congestion window hasn’t been updated recently.
- void
xprt_wake_pending_tasks(struct rpc_xprt * xprt, int status)¶ wake all tasks on a transport’s pending queue
Parameters
structrpc_xprt*xprt- transport with waiting tasks
intstatus- result code to plant in each task before waking it
- void
xprt_wait_for_buffer_space(struct rpc_xprt * xprt)¶ wait for transport output buffer to clear
Parameters
structrpc_xprt*xprt- transport
Description
Note that we only set the timer for the case of RPC_IS_SOFT(), since we don't in general want to force a socket disconnection due to an incomplete RPC call transmission.
- bool
xprt_write_space(struct rpc_xprt * xprt)¶ wake the task waiting for transport output buffer space
Parameters
structrpc_xprt*xprt- transport with waiting tasks
Description
Can be called in a soft IRQ context, so xprt_write_space never sleeps.
- void
xprt_disconnect_done(struct rpc_xprt * xprt)¶ mark a transport as disconnected
Parameters
structrpc_xprt*xprt- transport to flag for disconnect
- void
xprt_force_disconnect(struct rpc_xprt * xprt)¶ force a transport to disconnect
Parameters
structrpc_xprt*xprt- transport to disconnect
- unsigned long
xprt_reconnect_delay(const struct rpc_xprt * xprt)¶ compute the wait before scheduling a connect
Parameters
conststructrpc_xprt*xprt- transport instance
- void
xprt_reconnect_backoff(struct rpc_xprt * xprt, unsigned long init_to)¶ compute the new re-establish timeout
Parameters
structrpc_xprt*xprt- transport instance
unsignedlonginit_to- initial reestablish timeout
- struct rpc_rqst *
xprt_lookup_rqst(struct rpc_xprt * xprt, __be32 xid)¶ find an RPC request corresponding to an XID
Parameters
structrpc_xprt*xprt- transport on which the original request was transmitted
__be32xid- RPC XID of incoming reply
Description
Caller holds xprt->queue_lock.
- void
xprt_pin_rqst(struct rpc_rqst * req)¶ Pin a request on the transport receive list
Parameters
structrpc_rqst*req- Request to pin
Description
Caller must ensure this is atomic with the call toxprt_lookup_rqst()so should be holding xprt->queue_lock.
- void
xprt_unpin_rqst(struct rpc_rqst * req)¶ Unpin a request on the transport receive list
Parameters
structrpc_rqst*req- Request to unpin
Description
Caller should be holding xprt->queue_lock.
- void
xprt_update_rtt(struct rpc_task * task)¶ Update RPC RTT statistics
Parameters
structrpc_task*task- RPC request that recently completed
Description
Caller holds xprt->queue_lock.
- void
xprt_complete_rqst(struct rpc_task * task, int copied)¶ called when reply processing is complete
Parameters
structrpc_task*task- RPC request that recently completed
intcopied- actual number of bytes received from the transport
Description
Caller holds xprt->queue_lock.
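The three helpers above (lookup, pin/unpin and completion) are normally used together in a transport's receive path. A sketch of that pattern, with the data-copy step elided and the xid/copied values assumed to come from the transport's framing code:

#include <linux/spinlock.h>
#include <linux/sunrpc/xprt.h>

static void example_handle_reply(struct rpc_xprt *xprt, __be32 xid, int copied)
{
	struct rpc_rqst *req;

	spin_lock(&xprt->queue_lock);
	req = xprt_lookup_rqst(xprt, xid);
	if (!req) {
		spin_unlock(&xprt->queue_lock);
		return;		/* no matching request: drop the reply */
	}
	xprt_pin_rqst(req);
	spin_unlock(&xprt->queue_lock);

	/* ... copy reply data into req->rq_rcv_buf without the lock held ... */

	spin_lock(&xprt->queue_lock);
	xprt_complete_rqst(req->rq_task, copied);
	xprt_unpin_rqst(req);
	spin_unlock(&xprt->queue_lock);
}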
- void
xprt_wait_for_reply_request_def(struct rpc_task * task)¶ wait for reply
Parameters
structrpc_task*task- pointer to rpc_task
Description
Set a request’s retransmit timeout based on the transport’sdefault timeout parameters. Used by transports that don’t adjustthe retransmit timeout based on round-trip time estimation,and put the task to sleep on the pending queue.
- void
xprt_wait_for_reply_request_rtt(struct rpc_task * task)¶ wait for reply using RTT estimator
Parameters
structrpc_task*task- pointer to rpc_task
Description
Set a request’s retransmit timeout using the RTT estimator,and put the task to sleep on the pending queue.
- struct rpc_xprt *
xprt_get(struct rpc_xprt * xprt)¶ return a reference to an RPC transport.
Parameters
structrpc_xprt*xprt- pointer to the transport
- void
xprt_put(struct rpc_xprt * xprt)¶ release a reference to an RPC transport.
Parameters
structrpc_xprt*xprt- pointer to the transport
- void
rpc_wake_up(struct rpc_wait_queue * queue)¶ wake up all rpc_tasks
Parameters
structrpc_wait_queue*queue- rpc_wait_queue on which the tasks are sleeping
Description
Grabs queue->lock
- void
rpc_wake_up_status(struct rpc_wait_queue * queue, int status)¶ wake up all rpc_tasks and set their status value.
Parameters
structrpc_wait_queue*queue- rpc_wait_queue on which the tasks are sleeping
intstatus- status value to set
Description
Grabs queue->lock
- int
rpc_malloc(struct rpc_task * task)¶ allocate RPC buffer resources
Parameters
structrpc_task*task- RPC task
Description
A single memory region is allocated, which is split between the RPC call and RPC reply that this task is being used for. When this RPC is retired, the memory is released by calling rpc_free.
To prevent rpciod from hanging, this allocator never sleeps, returning -ENOMEM and suppressing warning if the request cannot be serviced immediately. The caller can arrange to sleep in a way that is safe for rpciod.
Most requests are 'small' (under 2KiB) and can be serviced from a mempool, ensuring that NFS reads and writes can always proceed, and that there is good locality of reference for these buffers.
- void
rpc_free(struct rpc_task * task)¶ free RPC buffer resources allocated via rpc_malloc
Parameters
structrpc_task*task- RPC task
Parameters
structxdr_buf*xdr- target XDR buffer
structsk_buff*skb- source skb
Description
We have set things up such that we perform the checksum of the UDP packet in parallel with the copies into the RPC client iovec. -DaveM
- struct rpc_iostats *
rpc_alloc_iostats(struct rpc_clnt * clnt)¶ allocate an rpc_iostats structure
Parameters
structrpc_clnt*clnt- RPC program, version, and xprt
- void
rpc_free_iostats(struct rpc_iostats * stats)¶ release an rpc_iostats structure
Parameters
structrpc_iostats*stats- doomed rpc_iostats structure
- void
rpc_count_iostats_metrics(const struct rpc_task * task, struct rpc_iostats * op_metrics)¶ tally up per-task stats
Parameters
conststructrpc_task*task- completed rpc_task
structrpc_iostats*op_metrics- stat structure for OP that will accumulate stats fromtask
- void
rpc_count_iostats(const struct rpc_task * task, struct rpc_iostats * stats)¶ tally up per-task stats
Parameters
conststructrpc_task*task- completed rpc_task
structrpc_iostats*stats- array of stat structures
Description
Uses the statidx fromtask
- int
rpc_queue_upcall(struct rpc_pipe * pipe, struct rpc_pipe_msg * msg)¶ queue an upcall message to userspace
Parameters
structrpc_pipe*pipe- upcall pipe on which to queue given message
structrpc_pipe_msg*msg- message to queue
Description
Call with an inode created by rpc_mkpipe() to queue an upcall. A userspace process may then later read the upcall by performing a read on an open file for this inode. It is up to the caller to initialize the fields of msg (other than msg->list) appropriately.
- struct dentry *
rpc_mkpipe_dentry(struct dentry * parent, const char * name, void * private, struct rpc_pipe * pipe)¶ make an rpc_pipefs file for kernel<->userspace communication
Parameters
structdentry*parent- dentry of directory to create new “pipe” in
constchar*name- name of pipe
void*private- private data to associate with the pipe, for the caller’s use
structrpc_pipe*pipe- rpc_pipe containing input parameters
Description
Data is made available for userspace to read by calls to rpc_queue_upcall(). The actual reads will result in calls to ops->upcall, which will be called with the file pointer, message, and userspace buffer to copy to.
Writes can come at any time, and do not necessarily have to be responses to upcalls. They will result in calls to msg->downcall.
The private argument passed here will be available to all these methods from the file pointer, via RPC_I(file_inode(file))->private.
- int
rpc_unlink(struct dentry * dentry)¶ remove a pipe
Parameters
structdentry*dentry- dentry for the pipe, as returned from rpc_mkpipe
Description
After this call, lookups will no longer find the pipe, and any attempts to read or write using preexisting opens of the pipe will return -EPIPE.
- void
rpc_init_pipe_dir_head(struct rpc_pipe_dir_head * pdh)¶ initialise a struct rpc_pipe_dir_head
Parameters
structrpc_pipe_dir_head*pdh- pointer to struct rpc_pipe_dir_head
- void
rpc_init_pipe_dir_object(struct rpc_pipe_dir_object * pdo, const struct rpc_pipe_dir_object_ops * pdo_ops, void * pdo_data)¶ initialise a struct rpc_pipe_dir_object
Parameters
structrpc_pipe_dir_object*pdo- pointer to struct rpc_pipe_dir_object
conststructrpc_pipe_dir_object_ops*pdo_ops- pointer to const struct rpc_pipe_dir_object_ops
void*pdo_data- pointer to caller-defined data
- int
rpc_add_pipe_dir_object(struct net * net, struct rpc_pipe_dir_head * pdh, struct rpc_pipe_dir_object * pdo)¶ associate a rpc_pipe_dir_object to a directory
Parameters
structnet*net- pointer to struct net
structrpc_pipe_dir_head*pdh- pointer to struct rpc_pipe_dir_head
structrpc_pipe_dir_object*pdo- pointer to struct rpc_pipe_dir_object
- void
rpc_remove_pipe_dir_object(struct net * net, struct rpc_pipe_dir_head * pdh, struct rpc_pipe_dir_object * pdo)¶ remove a rpc_pipe_dir_object from a directory
Parameters
structnet*net- pointer to struct net
structrpc_pipe_dir_head*pdh- pointer to struct rpc_pipe_dir_head
structrpc_pipe_dir_object*pdo- pointer to struct rpc_pipe_dir_object
- struct rpc_pipe_dir_object *
rpc_find_or_alloc_pipe_dir_object(struct net * net, struct rpc_pipe_dir_head * pdh, int (*match)(struct rpc_pipe_dir_object *, void *), struct rpc_pipe_dir_object *(*alloc) (void *), void * data)¶
Parameters
structnet*net- pointer to struct net
structrpc_pipe_dir_head*pdh- pointer to struct rpc_pipe_dir_head
int(*)(structrpc_pipe_dir_object*,void*)match- match struct rpc_pipe_dir_object to data
structrpc_pipe_dir_object*(*)(void*)alloc- allocate a new struct rpc_pipe_dir_object
void*data- user defined data for match() and alloc()
- void
rpcb_getport_async(struct rpc_task * task)¶ obtain the port for a given RPC service on a given host
Parameters
structrpc_task*task- task that is waiting for portmapper request
Description
This one can be called for an ongoing RPC request, and can be used in an async (rpciod) context.
- struct rpc_clnt *
rpc_create(struct rpc_create_args * args)¶ create an RPC client and transport with one call
Parameters
structrpc_create_args*args- rpc_clnt create argument structure
Description
Creates and initializes an RPC transport and an RPC client.
It can ping the server in order to determine if it is up, and to see if it supports this program and version. RPC_CLNT_CREATE_NOPING disables this behavior so asynchronous tasks can also use rpc_create.
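A hedged creation sketch; the server address, program definition, version, server name and flag choice are placeholders supplied by the caller:

#include <linux/in.h>
#include <linux/sunrpc/clnt.h>
#include <net/net_namespace.h>

static struct rpc_clnt *example_create_client(struct sockaddr_in *sin,
					      const struct rpc_program *program)
{
	struct rpc_create_args args = {
		.net		= &init_net,
		.protocol	= XPRT_TRANSPORT_TCP,
		.address	= (struct sockaddr *)sin,
		.addrsize	= sizeof(*sin),
		.servername	= "example-server",	/* illustrative */
		.program	= program,
		.version	= 1,
		.authflavor	= RPC_AUTH_UNIX,
		.flags		= RPC_CLNT_CREATE_NOPING, /* skip the ping */
	};

	return rpc_create(&args);	/* ERR_PTR() on failure */
}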
- struct rpc_clnt *
rpc_clone_client(struct rpc_clnt * clnt)¶ Clone an RPC client structure
Parameters
structrpc_clnt*clnt- RPC client whose parameters are copied
Description
Returns a fresh RPC client or an ERR_PTR.
- struct rpc_clnt *
rpc_clone_client_set_auth(struct rpc_clnt * clnt, rpc_authflavor_t flavor)¶ Clone an RPC client structure and set its auth
Parameters
structrpc_clnt*clnt- RPC client whose parameters are copied
rpc_authflavor_tflavor- security flavor for new client
Description
Returns a fresh RPC client or an ERR_PTR.
- int
rpc_switch_client_transport(struct rpc_clnt * clnt, struct xprt_create * args, const struct rpc_timeout * timeout)¶
Parameters
structrpc_clnt*clnt- pointer to a struct rpc_clnt
structxprt_create*args- pointer to the new transport arguments
conststructrpc_timeout*timeout- pointer to the new timeout parameters
Description
This function allows the caller to switch the RPC transport for the rpc_clnt structure 'clnt' to allow it to connect to a mirrored NFS server, for instance. It assumes that the caller has ensured that there are no active RPC tasks by using some form of locking.
Returns zero if "clnt" is now using the new xprt. Otherwise a negative errno is returned, and "clnt" continues to use the old xprt.
- int
rpc_clnt_iterate_for_each_xprt(struct rpc_clnt * clnt, int (*fn)(struct rpc_clnt *, struct rpc_xprt *, void *), void * data)¶ Apply a function to all transports
Parameters
structrpc_clnt*clnt- pointer to client
int(*)(structrpc_clnt*,structrpc_xprt*,void*)fn- function to apply
void*data- void pointer to function data
Description
Iterates through the list of RPC transports currently attached to theclient and applies the function fn(clnt, xprt, data).
On error, the iteration stops, and the function returns the error value.
- struct rpc_clnt *
rpc_bind_new_program(struct rpc_clnt * old, const struct rpc_program * program, u32 vers)¶ bind a new RPC program to an existing client
Parameters
structrpc_clnt*old- old rpc_client
conststructrpc_program*program- rpc program to set
u32vers- rpc program version
Description
Clones the rpc client and sets up a new RPC program. This is mainlyof use for enabling different RPC programs to share the same transport.The Sun NFSv2/v3 ACL protocol can do this.
- struct rpc_task *
rpc_run_task(const struct rpc_task_setup * task_setup_data)¶ Allocate a new RPC task, then run rpc_execute against it
Parameters
conststructrpc_task_setup*task_setup_data- pointer to task initialisation data
- int
rpc_call_sync(struct rpc_clnt * clnt, const struct rpc_message * msg, int flags)¶ Perform a synchronous RPC call
Parameters
structrpc_clnt*clnt- pointer to RPC client
conststructrpc_message*msg- RPC call parameters
intflags- RPC call flags
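A minimal synchronous-call sketch; the procedure entry and argument/result pointers are assumed to be prepared by the caller, and RPC_TASK_SOFT is just one possible flag choice:

#include <linux/sunrpc/clnt.h>

static int example_call(struct rpc_clnt *clnt,
			const struct rpc_procinfo *proc,
			void *argp, void *resp)
{
	struct rpc_message msg = {
		.rpc_proc = proc,
		.rpc_argp = argp,
		.rpc_resp = resp,
	};

	return rpc_call_sync(clnt, &msg, RPC_TASK_SOFT);
}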
- int
rpc_call_async(struct rpc_clnt * clnt, const struct rpc_message * msg, int flags, const struct rpc_call_ops * tk_ops, void * data)¶ Perform an asynchronous RPC call
Parameters
structrpc_clnt*clnt- pointer to RPC client
conststructrpc_message*msg- RPC call parameters
intflags- RPC call flags
conststructrpc_call_ops*tk_ops- RPC call ops
void*data- user call data
- void
rpc_prepare_reply_pages(struct rpc_rqst * req, struct page ** pages, unsigned int base, unsigned int len, unsigned int hdrsize)¶ Prepare to receive a reply data payload into pages
Parameters
structrpc_rqst*req- RPC request to prepare
structpage**pages- vector of struct page pointers
unsignedintbase- offset in first page where receive should start, in bytes
unsignedintlen- expected size of the upper layer data payload, in bytes
unsignedinthdrsize- expected size of upper layer reply header, in XDR words
- size_t
rpc_peeraddr(struct rpc_clnt * clnt, struct sockaddr * buf, size_t bufsize)¶ extract remote peer address from clnt’s xprt
Parameters
structrpc_clnt*clnt- RPC client structure
structsockaddr*buf- target buffer
size_tbufsize- length of target buffer
Description
Returns the number of bytes that are actually in the stored address.
- const char *
rpc_peeraddr2str(struct rpc_clnt * clnt, enum rpc_display_format_t format)¶ return remote peer address in printable format
Parameters
structrpc_clnt*clnt- RPC client structure
enumrpc_display_format_tformat- address format
Description
NB: the lifetime of the memory referenced by the returned pointer isthe same as the rpc_xprt itself. As long as the caller uses thispointer, it must hold the RCU read lock.
- int
rpc_localaddr(struct rpc_clnt * clnt, struct sockaddr * buf, size_t buflen)¶ discover local endpoint address for an RPC client
Parameters
structrpc_clnt*clnt- RPC client structure
structsockaddr*buf- target buffer
size_tbuflen- size of target buffer, in bytes
Description
Returns zero and fills in "buf" and "buflen" if successful; otherwise, a negative errno is returned.
This works even if the underlying transport is not currently connected, or if the upper layer never previously provided a source address.
The result of this function call is transient: multiple calls in succession may give different results, depending on how local networking configuration changes over time.
- struct net *
rpc_net_ns(struct rpc_clnt * clnt)¶ Get the network namespace for this RPC client
Parameters
structrpc_clnt*clnt- RPC client to query
- size_t
rpc_max_payload(struct rpc_clnt * clnt)¶ Get maximum payload size for a transport, in bytes
Parameters
structrpc_clnt*clnt- RPC client to query
Description
For stream transports, this is one RPC record fragment (see RFC 1831), as we don't support multi-record requests yet. For datagram transports, this is the size of an IP packet minus the IP, UDP, and RPC header sizes.
- size_t
rpc_max_bc_payload(struct rpc_clnt * clnt)¶ Get maximum backchannel payload size, in bytes
Parameters
structrpc_clnt*clnt- RPC client to query
- void
rpc_force_rebind(struct rpc_clnt * clnt)¶ force transport to check that remote port is unchanged
Parameters
structrpc_clnt*clnt- client to rebind
- int
rpc_clnt_test_and_add_xprt(struct rpc_clnt * clnt, struct rpc_xprt_switch * xps, struct rpc_xprt * xprt, void * dummy)¶ Test and add a new transport to a rpc_clnt
Parameters
structrpc_clnt*clnt- pointer to struct rpc_clnt
structrpc_xprt_switch*xps- pointer to struct rpc_xprt_switch,
structrpc_xprt*xprt- pointer struct rpc_xprt
void*dummy- unused
- int
rpc_clnt_setup_test_and_add_xprt(struct rpc_clnt * clnt, struct rpc_xprt_switch * xps, struct rpc_xprt * xprt, void * data)¶
Parameters
structrpc_clnt*clnt- struct rpc_clnt to get the new transport
structrpc_xprt_switch*xps- the rpc_xprt_switch to hold the new transport
structrpc_xprt*xprt- the rpc_xprt to test
void*data- a struct rpc_add_xprt_test pointer that holds the test functionand test function call data
Description
- This is an rpc_clnt_add_xprt setup() function which returns 1 so:
- 1) caller of the test function must dereference the rpc_xprt_switch and the rpc_xprt.
- 2) test function must call rpc_xprt_switch_add_xprt, usually in the rpc_call_done routine.
Upon success (return of 1), the test function adds the new transport to the rpc_clnt xprt switch.
- int
rpc_clnt_add_xprt(struct rpc_clnt * clnt, struct xprt_create * xprtargs, int (*setup)(struct rpc_clnt *, struct rpc_xprt_switch *, struct rpc_xprt *, void *), void * data)¶ Add a new transport to a rpc_clnt
Parameters
structrpc_clnt*clnt- pointer to struct rpc_clnt
structxprt_create*xprtargs- pointer to struct xprt_create
int(*)(structrpc_clnt*,structrpc_xprt_switch*,structrpc_xprt*,void*)setup- callback to test and/or set up the connection
void*data- pointer to setup function data
Description
Creates a new transport using the parameters set in args and adds it to clnt. If ping is set, then test that connectivity succeeds before adding the new transport.
WiMAX¶
- structsk_buff *
wimax_msg_alloc(structwimax_dev * wimax_dev, const char * pipe_name, const void * msg, size_t size, gfp_t gfp_flags)¶ Create a new skb for sending a message to userspace
Parameters
structwimax_dev*wimax_dev- WiMAX device descriptor
constchar*pipe_name- “named pipe” the message will be sent to
constvoid*msg- pointer to the message data to send
size_tsize- size of the message to send (in bytes), including the header.
gfp_tgfp_flags- flags for memory allocation.
Return
0 if ok, negative errno code on error
Description
Allocates an skb that will contain the message to send to user space over the messaging pipe and initializes it, copying the payload.
Once this call is done, you can deliver it with wimax_msg_send().
IMPORTANT:
Don't use skb_push()/skb_pull()/skb_reserve() on the skb, as wimax_msg_send() depends on skb->data being placed at the beginning of the user message.
Unlike other WiMAX stack calls, this call can be used way early, even before wimax_dev_add() is called, as long as the wimax_dev->net_dev pointer is set to point to a proper net_dev. This is so that drivers can use it early in case they need to send stuff around or communicate with user space.
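A hedged sketch combining wimax_msg_alloc() with wimax_msg_send() (documented below); the pipe name and payload are illustrative. For a one-shot message, wimax_msg() further below wraps both steps:

#include <linux/err.h>
#include <linux/gfp.h>
#include <net/wimax.h>

static int example_report_event(struct wimax_dev *wimax_dev,
				const void *payload, size_t size)
{
	struct sk_buff *skb;

	skb = wimax_msg_alloc(wimax_dev, "example.events", payload, size,
			      GFP_KERNEL);
	if (IS_ERR(skb))
		return PTR_ERR(skb);
	return wimax_msg_send(wimax_dev, skb);	/* consumes skb on success */
}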
- const void *
wimax_msg_data_len(structsk_buff * msg, size_t * size)¶ Return a pointer and size of a message’s payload
Parameters
structsk_buff*msg- Pointer to a message created with wimax_msg_alloc()
size_t*size- Pointer to where to store the message's size
Description
Returns the pointer to the message data.
Parameters
structsk_buff*msg- Pointer to a message created with wimax_msg_alloc()
Parameters
structsk_buff*msg- Pointer to a message created with wimax_msg_alloc()
- int
wimax_msg_send(structwimax_dev * wimax_dev, structsk_buff * skb)¶ Send a pre-allocated message to user space
Parameters
structwimax_dev*wimax_dev- WiMAX device descriptor
structsk_buff*skb- struct sk_buff returned by wimax_msg_alloc(). Note the ownership of skb is transferred to this function.
Return
0 if ok, < 0 errno code on error
Description
Sends a free-form message that was preallocated with wimax_msg_alloc() and filled up.
Assumes that once you pass an skb to this function for sending, it owns it and will release it when done (on success).
IMPORTANT:
Don't use skb_push()/skb_pull()/skb_reserve() on the skb, as wimax_msg_send() depends on skb->data being placed at the beginning of the user message.
Unlike other WiMAX stack calls, this call can be used way early, even before wimax_dev_add() is called, as long as the wimax_dev->net_dev pointer is set to point to a proper net_dev. This is so that drivers can use it early in case they need to send stuff around or communicate with user space.
- int
wimax_msg(structwimax_dev * wimax_dev, const char * pipe_name, const void * buf, size_t size, gfp_t gfp_flags)¶ Send a message to user space
Parameters
structwimax_dev*wimax_dev- WiMAX device descriptor (properly referenced)
constchar*pipe_name- “named pipe” the message will be sent to
constvoid*buf- pointer to the message to send.
size_tsize- size of the buffer pointed to bybuf (in bytes).
gfp_tgfp_flags- flags for memory allocation.
Return
0 if ok, negative errno code on error.
Description
Sends a free-form message to user space on the device wimax_dev.
Once the skb is given to this function, it owns it and will release it when done (unless it returns an error).
NOTES
Parameters
structwimax_dev*wimax_dev- WiMAX device descriptor
Return
Description
0 if ok and a warm reset was done (the device still exists in the system).
-ENODEV if a cold/bus reset had to be done (device has disconnected and reconnected, so current handle is not valid any more).
-EINVAL if the device is not even registered.
Any other negative error code shall be considered as non-recoverable.
Called when wanting to reset the device for any reason. Device is taken back to power on status.
This call blocks; on successful return, the device has completed the reset process and is ready to operate.
- void
wimax_report_rfkill_hw(structwimax_dev * wimax_dev, enum wimax_rf_state state)¶ Reports changes in the hardware RF switch
Parameters
structwimax_dev*wimax_dev- WiMAX device descriptor
enumwimax_rf_statestate- New state of the RF Kill switch.
WIMAX_RF_ONradio on,WIMAX_RF_OFFradio off.
Description
When the device detects a change in the state of the hardware RF switch, it must call this function to let the WiMAX kernel stack know that the state has changed so it can be properly propagated.
The WiMAX stack caches the state (the driver doesn't need to). As well, as the change is propagated it will come back as a request to change the software state to mirror the hardware state.
If the device doesn't have a hardware kill switch, just report it on initialization as always on (WIMAX_RF_ON, radio on).
- void
wimax_report_rfkill_sw(structwimax_dev * wimax_dev, enum wimax_rf_state state)¶ Reports changes in the software RF switch
Parameters
structwimax_dev*wimax_dev- WiMAX device descriptor
enumwimax_rf_statestate- New state of the RF kill switch.
WIMAX_RF_ONradio on,WIMAX_RF_OFFradio off.
Description
Reports changes in the software RF switch state to the WiMAX stack.
The main use is during initialization, so the driver can query the device for its current software radio kill switch state and feed it to the system.
On the side, the device does not change the software state by itself. In practice, this can happen, as the device might decide to switch (in software) the radio off for different reasons.
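A small initialization-time sketch of the reporting described above, assuming a device with no hardware kill switch whose radio starts switched off:

#include <net/wimax.h>

static void example_report_initial_rfkill(struct wimax_dev *wimax_dev)
{
	wimax_report_rfkill_hw(wimax_dev, WIMAX_RF_ON);	/* no HW switch */
	wimax_report_rfkill_sw(wimax_dev, WIMAX_RF_OFF);
}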
- int
wimax_rfkill(structwimax_dev * wimax_dev, enum wimax_rf_state state)¶ Set the software RF switch state for a WiMAX device
Parameters
structwimax_dev*wimax_dev- WiMAX device descriptor
enumwimax_rf_statestate- New RF state.
Return
Description
>= 0 toggle state if ok, < 0 errno code on error. The toggle state is returned as a bitmap, bit 0 being the hardware RF state, bit 1 the software RF state.
0 means disabled (WIMAX_RF_ON, radio on), 1 means enabled radio off (WIMAX_RF_OFF).
Called by the user when he wants to request the WiMAX radio to be switched on (WIMAX_RF_ON) or off (WIMAX_RF_OFF). With WIMAX_RF_QUERY, just the current state is returned.
This call will block until the operation is complete.
NOTE
- void
wimax_state_change(structwimax_dev * wimax_dev, enumwimax_st new_state)¶ Set the current state of a WiMAX device
Parameters
structwimax_dev*wimax_dev- WiMAX device descriptor (properly referenced)
enumwimax_stnew_state- New state to switch to
Description
This implements the state changes for the wimax devices. It will
- verify that the state transition is legal (for now it'll just print a warning if not) according to the table in linux/wimax.h's documentation for 'enum wimax_st'.
- perform the actions needed for leaving the current state and whichever are needed for entering the new state.
- issue a report to user space indicating the new state (and an optional payload with information about the new state).
NOTE
wimax_dev must be locked
Parameters
structwimax_dev*wimax_dev- WiMAX device descriptor
Return
Current state of the device according to its driver.
Parameters
structwimax_dev*wimax_dev- WiMAX device descriptor to initialize.
Description
Initializes fields of a freshly allocated wimax_dev instance. This function assumes that after allocation, the memory occupied by wimax_dev was zeroed.
- int
wimax_dev_add(structwimax_dev * wimax_dev, structnet_device * net_dev)¶ Register a new WiMAX device
Parameters
structwimax_dev*wimax_dev- WiMAX device descriptor (as embedded in your net_dev's priv data). You must have called wimax_dev_init() on it before.
structnet_device*net_dev- net device the wimax_dev is associated with. The function expects SET_NETDEV_DEV() and register_netdev() were already called on it.
Description
Registers the new WiMAX device, sets up the user-kernel control interface (generic netlink) and common WiMAX infrastructure.
Note that the parts that will allow interaction with user space are setup at the very end, when the rest is in place, as once that happens, the driver might get user space control requests via netlink or from debugfs that might translate into calls into wimax_dev->op_*().
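A probe-time sketch of the registration order described above; the parent device and the contents of the [fill] fields are assumed to be driver specific:

#include <linux/netdevice.h>
#include <net/wimax.h>

static int example_probe_register(struct net_device *net_dev,
				  struct device *parent)
{
	/* wimax_dev is assumed to be embedded at the start of the netdev
	 * private area, as required by struct wimax_dev (see below). */
	struct wimax_dev *wimax_dev = netdev_priv(net_dev);
	int err;

	wimax_dev_init(wimax_dev);
	/* ... fill the [fill] fields before calling wimax_dev_add() ... */

	SET_NETDEV_DEV(net_dev, parent);
	err = register_netdev(net_dev);
	if (err)
		return err;

	err = wimax_dev_add(wimax_dev, net_dev);
	if (err)
		unregister_netdev(net_dev);
	return err;
}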
Parameters
structwimax_dev*wimax_dev- WiMAX device descriptor
Description
Unregisters a WiMAX device previously registered for use with wimax_dev_add().
IMPORTANT! Must call before calling unregister_netdev().
After this function returns, you will not get any more user space control requests (via netlink or debugfs) and thus no more calls will be made to wimax_dev->ops.
Reentrancy control is ensured by setting the state to __WIMAX_ST_QUIESCING. rfkill operations coming through wimax_*rfkill*() will be stopped by the quiescing state; ops coming from the rfkill subsystem will be stopped by the support being removed by wimax_rfkill_rm().
- struct
wimax_dev¶ Generic WiMAX device
Definition
struct wimax_dev { struct net_device *net_dev; struct list_head id_table_node; struct mutex mutex; struct mutex mutex_reset; enum wimax_st state; int (*op_msg_from_user)(struct wimax_dev *wimax_dev,const char *,const void *, size_t, const struct genl_info *info); int (*op_rfkill_sw_toggle)(struct wimax_dev *wimax_dev, enum wimax_rf_state); int (*op_reset)(struct wimax_dev *wimax_dev); struct rfkill *rfkill; unsigned int rf_hw; unsigned int rf_sw; char name[32]; struct dentry *debugfs_dentry;};Members
net_dev- [fill] Pointer to the
structnet_devicethis WiMAXdevice implements. id_table_node- [private] link to the list of wimax devices kept byid-table.c. Protected by it’s own spinlock.
mutex- [private] Serializes all concurrent access and execution ofoperations.
mutex_reset- [private] Serializes reset operations. Needs to be adifferent mutex because as part of the reset operation, thedriver has to call back into the stack to do things such asstate change, that require wimax_dev->mutex.
state- [private] Current state of the WiMAX device.
op_msg_from_user- [fill] Driver-specific operation tohandle a raw message from user space to the driver. Thedriver can send messages to user space using withwimax_msg_to_user().
op_rfkill_sw_toggle- [fill] Driver-specific operation to act onuserspace (or any other agent) requesting the WiMAX device tochange the RF Kill software switch (WIMAX_RF_ON orWIMAX_RF_OFF).If such hardware support is not present, it is assumed theradio cannot be switched off and it is always on (and the stackwill error out when trying to switch it off). In such case,this function pointer can be left as NULL.
op_reset- [fill] Driver specific operation to reset thedevice.This operation should always attempt first a warm reset thatdoes not disconnect the device from the bus and return 0.If that fails, it should resort to some sort of cold or busreset (even if it implies a bus disconnection and devicedisappearance). In that case, -ENODEV should be returned toindicate the device is gone.This operation has to be synchronous, and return only when thereset is complete. In case of having had to resort to bus/coldreset implying a device disconnection, the call is allowed toreturn immediately.
rfkill- [private] integration into the RF-Kill infrastructure.
rf_hw- [private] State of the hardware radio switch (OFF/ON)
rf_sw- [private] State of the software radio switch (OFF/ON)
name- [fill] A way to identify this device. We need to register a name with many subsystems (rfkill, workqueue creation, etc). We can't use the network device name as that might change and in some instances we don't know it yet (until we call register_netdev()). So we generate a unique one using the driver name and device bus id, place it here and use it across the board. Recommended naming: DRIVERNAME-BUSNAME:BUSID (dev->bus->name, dev->bus_id).
debugfs_dentry- [private] Used to hook up a debugfs entry. This shows up in the debugfs root as wimax:DEVICENAME.
NOTE
- wimax_dev->mutex is NOT locked when this op is being
- called; however, wimax_dev->mutex_reset IS locked to ensureserialization of calls to
wimax_reset().Seewimax_reset()’s documentation.
Description
This structure defines a common interface to access all WiMAXdevices from different vendors and provides a common API as well asa free-form device-specific messaging channel.
- Usage:
- Embed a struct wimax_dev at the beginning of the network device structure so that netdev_priv() points to it.
- memset() it to zero.
- Initialize with wimax_dev_init(). This will leave the WiMAX device in the __WIMAX_ST_NULL state.
- Fill all the fields marked with [fill]; once wimax_dev_add() has been called, those fields CANNOT be modified.
- Call wimax_dev_add() after registering the network device. This will leave the WiMAX device in the WIMAX_ST_DOWN state. Protect the driver's net_device->open() against succeeding if the wimax device state is lower than WIMAX_ST_DOWN.
- Select when the device is going to be turned on/initialized; for example, it could be initialized on 'ifconfig up' (when the netdev op 'open()' is called on the driver).
When the device is initialized (at 'ifconfig up' time, or right after calling wimax_dev_add() from _probe()), make sure the following steps are taken:
- Move the device to WIMAX_ST_UNINITIALIZED. This is needed so some API calls that shouldn't work until the device is ready can be blocked.
- Initialize the device. Make sure to turn the SW radio switch off and move the device to state WIMAX_ST_RADIO_OFF when done. When just initialized, a device should be left in RADIO OFF state until user space decides to turn it on.
- Query the device for the state of the hardware rfkill switch and call wimax_rfkill_report_hw() and wimax_rfkill_report_sw() as needed. See below.
wimax_dev_rm() undoes this before unregistering the network device. Once wimax_dev_add() is called, the driver can get called on the wimax_dev->op_* function pointers.
CONCURRENCY:
The stack provides a mutex for each device that will disallow API calls happening concurrently; thus, op calls into the driver through the wimax_dev->op*() function pointers will always be serialized and never concurrent.
For locking, wimax_dev->mutex is taken; (most) operations in the API have to check for wimax_dev_is_ready() to return 0 before continuing (this is done internally).
REFERENCE COUNTING:
The WiMAX device is reference counted by the associated networkdevice. The only operation that can be used to reference the deviceis wimax_dev_get_by_genl_info(), and the reference it acquires hasto be released with dev_put(wimax_dev->net_dev).
RFKILL:
At startup, both HW and SW radio switches are assumed to be off.
At initialization time [after calling wimax_dev_add()], have the driver query the device for the status of the software and hardware RF kill switches and call wimax_report_rfkill_hw() and wimax_rfkill_report_sw() to indicate their state. If any is missing, just call it to indicate it is ON (radio always on).
Whenever the driver detects a change in the state of the RF kill switches, it should call wimax_report_rfkill_hw() or wimax_report_rfkill_sw() to report it to the stack.
- enum
wimax_st¶ The different states of a WiMAX device
Constants
__WIMAX_ST_NULL- The device structure has been allocated and zeroed, but wimax_dev_add() hasn't been called yet. There is no state.
WIMAX_ST_DOWN- The device has been registered with the WiMAX and networking stacks, but it is not initialized (normally that is done with 'ifconfig DEV up' [or equivalent], which can upload firmware and enable communications with the device). In this state, the device is powered down and using as little power as possible. This state is the default after a call to wimax_dev_add(). It is ok to have drivers move directly to WIMAX_ST_UNINITIALIZED or WIMAX_ST_RADIO_OFF in _probe() after the call to wimax_dev_add(). It is recommended that the driver leaves this state when calling 'ifconfig DEV up' and enters it back on 'ifconfig DEV down'.
__WIMAX_ST_QUIESCING- The device is being torn down, so no API operations are allowed to proceed except the ones needed to complete the device clean up process.
WIMAX_ST_UNINITIALIZED- [optional] Communication with the device is set up, but the device still requires some configuration before being operational. Some WiMAX API calls might work.
WIMAX_ST_RADIO_OFF- The device is fully up; radio is off (whether by hardware or software switches). It is recommended to always leave the device in this state after initialization.
WIMAX_ST_READY- The device is fully up and radio is on.
WIMAX_ST_SCANNING- [optional] The device has been instructed toscan. In this state, the device cannot be actively connected toa network.
WIMAX_ST_CONNECTING- The device is connecting to a network. Thisstate exists because in some devices, the connect process caninclude a number of negotiations between user space, kernelspace and the device. User space needs to know what the deviceis doing. If the connect sequence in a device is atomic andfast, the device can transition directly to CONNECTED
WIMAX_ST_CONNECTED- The device is connected to a network.
__WIMAX_ST_INVALID- This is an invalid state used to mark themaximum numeric value of states.
Description
Transitions from one state to another one are atomic and can onlybe caused in kernel space withwimax_state_change(). To read thestate, usewimax_state_get().
States starting with __ are internal and shall not be used orreferred to by drivers or userspace. They look ugly, but that’s thepoint – if any use is made non-internal to the stack, it is easierto catch on review.
All API operations [with well defined exceptions] will take thedevice mutex before starting and then check the state. If the stateis__WIMAX_ST_NULL,WIMAX_ST_DOWN,WIMAX_ST_UNINITIALIZED or__WIMAX_ST_QUIESCING, it will drop the lock and quit with-EINVAL, -ENOMEDIUM, -ENOTCONN or -ESHUTDOWN.
The order of the definitions is important, so we can do numericalcomparisons (eg: <WIMAX_ST_RADIO_OFF means the device is not readyto operate).
Network device support¶
Driver Support¶
- void
dev_add_pack(struct packet_type * pt)¶ add packet handler
Parameters
structpacket_type*ptpacket type declaration
Add a protocol handler to the networking stack. The passed
packet_type is linked into kernel lists and may not be freed until it has been removed from the kernel lists. This call does not sleep therefore it can not guarantee all CPU's that are in middle of receiving packets will see the new packet type (until the next received packet).
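A hedged sketch of registering (and later removing, see dev_remove_pack() below) a handler from a module; the EtherType choice and the handler body are purely illustrative:

#include <linux/module.h>
#include <linux/netdevice.h>
#include <linux/if_ether.h>

/* Hypothetical receive handler for a local/experimental EtherType. */
static int example_rcv(struct sk_buff *skb, struct net_device *dev,
		       struct packet_type *pt, struct net_device *orig_dev)
{
	/* ... examine skb ... */
	kfree_skb(skb);
	return NET_RX_SUCCESS;
}

static struct packet_type example_packet_type __read_mostly = {
	.type	= cpu_to_be16(ETH_P_802_EX1),	/* local experimental */
	.func	= example_rcv,
};

static int __init example_proto_init(void)
{
	dev_add_pack(&example_packet_type);
	return 0;
}

static void __exit example_proto_exit(void)
{
	dev_remove_pack(&example_packet_type);
}

module_init(example_proto_init);
module_exit(example_proto_exit);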
- void
__dev_remove_pack(struct packet_type * pt)¶ remove packet handler
Parameters
structpacket_type*ptpacket type declaration
Remove a protocol handler that was previously added to the kernelprotocol handlers by
dev_add_pack(). The passedpacket_typeis removedfrom the kernel lists and can be freed or reused once this functionreturns.The packet type might still be in use by receiversand must not be freed until after all the CPU’s have gonethrough a quiescent state.
- void
dev_remove_pack(struct packet_type * pt)¶ remove packet handler
Parameters
structpacket_type*ptpacket type declaration
Remove a protocol handler that was previously added to the kernelprotocol handlers by
dev_add_pack(). The passedpacket_typeis removedfrom the kernel lists and can be freed or reused once this functionreturns.This call sleeps to guarantee that no CPU is looking at the packettype after return.
- void
dev_add_offload(struct packet_offload * po)¶ register offload handlers
Parameters
structpacket_offload*poprotocol offload declaration
Add protocol offload handlers to the networking stack. The passed
proto_offloadis linked into kernel lists and may not be freed untilit has been removed from the kernel lists.This call does not sleep therefore it can notguarantee all CPU’s that are in middle of receiving packetswill see the new offload handlers (until the next received packet).
- void
dev_remove_offload(struct packet_offload * po)¶ remove packet offload handler
Parameters
structpacket_offload*popacket offload declaration
Remove a packet offload handler that was previously added to the kerneloffload handlers by
dev_add_offload(). The passedoffload_typeisremoved from the kernel lists and can be freed or reused once thisfunction returns.This call sleeps to guarantee that no CPU is looking at the packettype after return.
- int
netdev_boot_setup_check(structnet_device * dev)¶ check boot time settings
Parameters
structnet_device*dev- the netdevice
Description
Check boot time settings for the device. The found settings are set for the device to be used later in the device probing. Returns 0 if no settings found, 1 if they are.
- int
dev_get_iflink(const structnet_device * dev)¶ get ‘iflink’ value of a interface
Parameters
conststructnet_device*devtargeted interface
Indicates the ifindex the interface is linked to.Physical interfaces have the same ‘ifindex’ and ‘iflink’ values.
- int
dev_fill_metadata_dst(structnet_device * dev, structsk_buff * skb)¶ Retrieve tunnel egress information.
Parameters
structnet_device*dev- targeted interface
structsk_buff*skbThe packet.
For better visibility of tunnel traffic, OVS needs to retrieve egress tunnel information for a packet. The following API allows the user to get this info.
- structnet_device *
__dev_get_by_name(struct net * net, const char * name)¶ find a device by its name
Parameters
structnet*net- the applicable net namespace
constchar*namename to find
Find an interface by name. Must be called under RTNL semaphore or dev_base_lock. If the name is found a pointer to the device is returned. If the name is not found then NULL is returned. The reference counters are not incremented so the caller must be careful with locks.
- structnet_device *
dev_get_by_name_rcu(struct net * net, const char * name)¶ find a device by its name
Parameters
structnet*net- the applicable net namespace
constchar*name- name to find
Description
Find an interface by name. If the name is found a pointer to the device is returned. If the name is not found then NULL is returned. The reference counters are not incremented so the caller must be careful with locks. The caller must hold RCU lock.
- structnet_device *
dev_get_by_name(struct net * net, const char * name)¶ find a device by its name
Parameters
structnet*net- the applicable net namespace
constchar*namename to find
Find an interface by name. This can be called from any context and does its own locking. The returned handle has the usage count incremented and the caller must use dev_put() to release it when it is no longer needed. NULL is returned if no matching device is found.
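A short usage sketch, assuming the lookup happens in the initial network namespace and the caller drops the reference with dev_put() when finished:

#include <linux/netdevice.h>
#include <net/net_namespace.h>

static int example_use_device(const char *name)
{
	struct net_device *dev;

	dev = dev_get_by_name(&init_net, name);
	if (!dev)
		return -ENODEV;
	/* ... use dev while holding the reference ... */
	dev_put(dev);
	return 0;
}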
- structnet_device *
__dev_get_by_index(struct net * net, int ifindex)¶ find a device by its ifindex
Parameters
structnet*net- the applicable net namespace
intifindexindex of device
Search for an interface by index. Returns
NULLif the deviceis not found or a pointer to the device. The device has nothad its reference counter increased so the caller must be carefulabout locking. The caller must hold either the RTNL semaphoreordev_base_lock.
- structnet_device *
dev_get_by_index_rcu(struct net * net, int ifindex)¶ find a device by its ifindex
Parameters
structnet*net- the applicable net namespace
intifindexindex of device
Search for an interface by index. Returns
NULLif the deviceis not found or a pointer to the device. The device has nothad its reference counter increased so the caller must be carefulabout locking. The caller must hold RCU lock.
- structnet_device *
dev_get_by_index(struct net * net, int ifindex)¶ find a device by its ifindex
Parameters
structnet*net- the applicable net namespace
intifindexindex of device
Search for an interface by index. Returns NULL if the deviceis not found or a pointer to the device. The device returned hashad a reference added and the pointer is safe until the user callsdev_put to indicate they have finished with it.
- structnet_device *
dev_get_by_napi_id(unsigned int napi_id)¶ find a device by napi_id
Parameters
unsignedintnapi_idID of the NAPI struct
Search for an interface by NAPI ID. Returns
NULLif the deviceis not found or a pointer to the device. The device has not hadits reference counter increased so the caller must be carefulabout locking. The caller must hold RCU lock.
- structnet_device *
dev_getbyhwaddr_rcu(struct net * net, unsigned short type, const char * ha)¶ find a device by its hardware address
Parameters
structnet*net- the applicable net namespace
unsignedshorttype- media type of device
constchar*hahardware address
Search for an interface by MAC address. Returns NULL if the deviceis not found or a pointer to the device.The caller must hold RCU or RTNL.The returned device has not had its ref count increasedand the caller must therefore be careful about locking
- structnet_device *
__dev_get_by_flags(struct net * net, unsigned short if_flags, unsigned short mask)¶ find any device with given flags
Parameters
structnet*net- the applicable net namespace
unsignedshortif_flags- IFF_* values
unsignedshortmaskbitmask of bits in if_flags to check
Search for any interface with the given flags. Returns NULL if a deviceis not found or a pointer to the device. Must be called insidertnl_lock(), and result refcount is unchanged.
- bool
dev_valid_name(const char * name)¶ check if name is okay for network device
Parameters
constchar*namename string
Network device names need to be valid file names to allow sysfs to work. We also disallow any kind of whitespace.
- int
dev_alloc_name(structnet_device * dev, const char * name)¶ allocate a name for a device
Parameters
structnet_device*dev- device
constchar*namename format string
Passed a format string - eg "lt%d" it will try and find a suitable id. It scans list of devices to build up a free map, then chooses the first empty slot. The caller must hold the dev_base or rtnl lock while allocating the name and adding the device in order to avoid duplicates. Limited to bits_per_byte * page size devices (ie 32K on most platforms). Returns the number of the unit assigned or a negative errno code.
- int
dev_set_alias(structnet_device * dev, const char * alias, size_t len)¶ change ifalias of a device
Parameters
structnet_device*dev- device
constchar*alias- name up to IFALIASZ
size_tlenlimit of bytes to copy from info
Set ifalias for a device,
- void
netdev_features_change(structnet_device * dev)¶ device changes features
Parameters
structnet_device*devdevice to cause notification
Called to indicate a device has changed features.
- void
netdev_state_change(structnet_device * dev)¶ device changes state
Parameters
structnet_device*devdevice to cause notification
Called to indicate a device has changed state. This function callsthe notifier chains for netdev_chain and sends a NEWLINK messageto the routing socket.
- void
netdev_notify_peers(structnet_device * dev)¶ notify network peers about existence ofdev
Parameters
structnet_device*dev- network device
Description
Generate traffic such that interested network peers are aware ofdev, such as by generating a gratuitous ARP. This may be used whena device wants to inform the rest of the network about some sort ofreconfiguration such as a failover event or virtual machinemigration.
- int
dev_open(structnet_device * dev, struct netlink_ext_ack * extack)¶ prepare an interface for use.
Parameters
structnet_device*dev- device to open
structnetlink_ext_ack*extacknetlink extended ack
Takes a device from down to up state. The device's private open function is invoked and then the multicast lists are loaded. Finally the device is moved into the up state and a NETDEV_UP message is sent to the netdev notifier chain. Calling this function on an active interface is a nop. On a failure a negative errno code is returned.
- void
dev_close(structnet_device * dev)¶ shutdown an interface.
Parameters
structnet_device*devdevice to shutdown
This function moves an active device into down state. A NETDEV_GOING_DOWN is sent to the netdev notifier chain. The device is then deactivated and finally a NETDEV_DOWN is sent to the notifier chain.
- void
dev_disable_lro(structnet_device * dev)¶ disable Large Receive Offload on a device
Parameters
structnet_device*devdevice
Disable Large Receive Offload (LRO) on a net device. Must becalled under RTNL. This is needed if received packets may beforwarded to another interface.
- int
register_netdevice_notifier(struct notifier_block * nb)¶ register a network notifier block
Parameters
structnotifier_block*nb- notifier
Description
Register a notifier to be called when network device events occur. The notifier passed is linked into the kernel structures and must not be reused until it has been unregistered. A negative errno code is returned on a failure.
When registered all registration and up events are replayed to the new notifier to allow device to have a race free view of the network device list.
- int
unregister_netdevice_notifier(struct notifier_block * nb)¶ unregister a network notifier block
Parameters
structnotifier_block*nb- notifier
Description
Unregister a notifier previously registered by register_netdevice_notifier(). The notifier is unlinked from the kernel structures and may then be reused. A negative errno code is returned on a failure.
After unregistering, unregister and down device events are synthesized for all devices on the device list to the removed notifier to remove the need for special case cleanup code.
- int
register_netdevice_notifier_net(struct net * net, struct notifier_block * nb)¶ register a per-netns network notifier block
Parameters
structnet*net- network namespace
structnotifier_block*nb- notifier
Description
Register a notifier to be called when network device events occur. The notifier passed is linked into the kernel structures and must not be reused until it has been unregistered. A negative errno code is returned on a failure.
When registered all registration and up events are replayed to the new notifier to allow device to have a race free view of the network device list.
- int
unregister_netdevice_notifier_net(struct net * net, struct notifier_block * nb)¶ unregister a per-netns network notifier block
Parameters
structnet*net- network namespace
structnotifier_block*nb- notifier
Description
Unregister a notifier previously registered by register_netdevice_notifier_net(). The notifier is unlinked from the kernel structures and may then be reused. A negative errno code is returned on a failure.
After unregistering, unregister and down device events are synthesized for all devices on the device list to the removed notifier to remove the need for special case cleanup code.
- int
call_netdevice_notifiers(unsigned long val, structnet_device * dev)¶ call all network notifier blocks
Parameters
unsignedlongval- value passed unmodified to notifier function
structnet_device*devnet_device pointer passed unmodified to notifier function
Call all network notifier blocks. Parameters and return valueare as for raw_notifier_call_chain().
- int
dev_forward_skb(structnet_device * dev, structsk_buff * skb)¶ loopback an skb to another netif
Parameters
structnet_device*dev- destination network device
structsk_buff*skb- buffer to forward
Description
- return values:
- NET_RX_SUCCESS (no congestion)
- NET_RX_DROP (packet was dropped, but freed)
dev_forward_skb can be used for injecting an skb from the start_xmit function of one device into the receive queue of another device.
The receiving device may be in another namespace, so we have to clear all information in the skb that could impact namespace isolation.
- bool
dev_nit_active(structnet_device * dev)¶ return true if any network interface taps are in use
Parameters
structnet_device*dev- network device to check for the presence of taps
- int
netif_set_real_num_rx_queues(structnet_device * dev, unsigned int rxq)¶ set actual number of RX queues used
Parameters
structnet_device*dev- Network device
unsignedintrxqActual number of RX queues
This must be called either with the rtnl_lock held or beforeregistration of the net device. Returns 0 on success, or anegative error code. If called before registration, it alwayssucceeds.
- int
netif_get_num_default_rss_queues(void)¶ default number of RSS queues
Parameters
void- no arguments
Description
This routine should set an upper limit on the number of RSS queuesused by default by multiqueue devices.
- void
netif_device_detach(structnet_device * dev)¶ mark device as removed
Parameters
structnet_device*dev- network device
Description
Mark device as removed from system and therefore no longer available.
- void
netif_device_attach(structnet_device * dev)¶ mark device as attached
Parameters
structnet_device*dev- network device
Description
Mark device as attached from system and restart if needed.
- structsk_buff *
skb_mac_gso_segment(structsk_buff * skb, netdev_features_t features)¶ mac layer segmentation handler.
Parameters
structsk_buff*skb- buffer to segment
netdev_features_tfeatures- features for the output path (see dev->features)
- structsk_buff *
__skb_gso_segment(structsk_buff * skb, netdev_features_t features, bool tx_path)¶ Perform segmentation on skb.
Parameters
structsk_buff*skb- buffer to segment
netdev_features_tfeatures- features for the output path (see dev->features)
booltx_path- whether it is called in TX path
This function segments the given skb and returns a list of segments.
It may return NULL if the skb requires no segmentation. This is only possible when GSO is used for verifying header integrity.
Segmentation preserves SKB_GSO_CB_OFFSET bytes of previous skb cb.
Parameters
structnet*net- network namespace this loopback is happening in
structsock*sk- sk needed to be a netfilter okfn
structsk_buff*skb- buffer to transmit
- bool
rps_may_expire_flow(structnet_device * dev, u16 rxq_index, u32 flow_id, u16 filter_id)¶ check whether an RFS hardware filter may be removed
Parameters
structnet_device*dev- Device on which the filter was set
u16rxq_index- RX queue index
u32flow_id- Flow ID passed to ndo_rx_flow_steer()
u16filter_id- Filter ID returned by ndo_rx_flow_steer()
Description
Drivers that implement ndo_rx_flow_steer() should periodically call this function for each installed filter and remove the filters for which it returns true.
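A sketch of such a periodic scan, assuming a hypothetical driver that remembers one foo_rfs_filter per installed hardware filter (all foo_* names and the fixed-size table are assumptions; rps_may_expire_flow() is only built with CONFIG_RFS_ACCEL):

#include <linux/kernel.h>
#include <linux/netdevice.h>

struct foo_rfs_filter {
        bool installed;
        u16  rxq_index;    /* RX queue the flow was steered to */
        u32  flow_id;      /* flow_id passed to ndo_rx_flow_steer() */
        u16  filter_id;    /* filter_id returned by ndo_rx_flow_steer() */
};

struct foo_priv {
        struct net_device *netdev;
        struct foo_rfs_filter rfs_filters[128];
};

static void foo_hw_remove_filter(struct foo_priv *priv, u16 filter_id); /* hypothetical */

/* called periodically, e.g. from a delayed work item (not shown) */
static void foo_expire_rfs_filters(struct foo_priv *priv)
{
        unsigned int i;

        for (i = 0; i < ARRAY_SIZE(priv->rfs_filters); i++) {
                struct foo_rfs_filter *f = &priv->rfs_filters[i];

                if (!f->installed)
                        continue;

                if (rps_may_expire_flow(priv->netdev, f->rxq_index,
                                        f->flow_id, f->filter_id)) {
                        foo_hw_remove_filter(priv, f->filter_id);
                        f->installed = false;
                }
        }
}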
Parameters
structsk_buff*skb- buffer to post
This function receives a packet from a device driver and queues it for the upper (protocol) levels to process. It always succeeds. The buffer may be dropped during processing for congestion control or by the protocol layers.
Return values:
NET_RX_SUCCESS (no congestion)
NET_RX_DROP (packet was dropped)
- bool
netdev_is_rx_handler_busy(structnet_device * dev)¶ check if receive handler is registered
Parameters
structnet_device*dev- device to check
Check if a receive handler is already registered for a given device. Return true if there is one.
The caller must hold the rtnl_mutex.
- int
netdev_rx_handler_register(structnet_device * dev, rx_handler_func_t * rx_handler, void * rx_handler_data)¶ register receive handler
Parameters
structnet_device*dev- device to register a handler for
rx_handler_func_t*rx_handler- receive handler to register
void*rx_handler_data- data pointer that is used by rx handler
Register a receive handler for a device. This handler will then be called from __netif_receive_skb. A negative errno code is returned on a failure.
The caller must hold the rtnl_mutex.
For a general description of rx_handler, see enum rx_handler_result.
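A hedged sketch of a bonding/bridge-style handler (struct foo_port, its active and master fields, and the foo_* names are hypothetical); registration and unregistration must happen under the RTNL lock:

#include <linux/netdevice.h>
#include <linux/rtnetlink.h>

struct foo_port {
        struct net_device *master;   /* hypothetical aggregating device */
        bool active;
};

static rx_handler_result_t foo_handle_frame(struct sk_buff **pskb)
{
        struct sk_buff *skb = *pskb;
        struct foo_port *port = rcu_dereference(skb->dev->rx_handler_data);

        if (!port->active)
                return RX_HANDLER_PASS;      /* let the normal stack see it */

        skb->dev = port->master;             /* steer to the upper device */
        return RX_HANDLER_ANOTHER;           /* re-run receive on the new dev */
}

static int foo_attach_port(struct net_device *port_dev, struct foo_port *port)
{
        ASSERT_RTNL();

        if (netdev_is_rx_handler_busy(port_dev))
                return -EBUSY;

        return netdev_rx_handler_register(port_dev, foo_handle_frame, port);
}

static void foo_detach_port(struct net_device *port_dev)
{
        ASSERT_RTNL();
        netdev_rx_handler_unregister(port_dev);
}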
- void
netdev_rx_handler_unregister(structnet_device * dev)¶ unregister receive handler
Parameters
structnet_device*dev- device to unregister a handler from
Unregister a receive handler from a device.
The caller must hold the rtnl_mutex.
Parameters
structsk_buff*skb- buffer to process
More direct receive version of netif_receive_skb(). It should only be used by callers that have a need to skip RPS and Generic XDP. Caller must also take care of handling if (page_is_)pfmemalloc. This function may only be called from softirq context and interrupts should be enabled.
Return values (usually ignored):
NET_RX_SUCCESS: no congestion
NET_RX_DROP: packet was dropped
- int
netif_receive_skb(structsk_buff * skb)¶ process receive buffer from network
Parameters
structsk_buff*skb- buffer to process
netif_receive_skb() is the main receive data processing function. It always succeeds. The buffer may be dropped during processing for congestion control or by the protocol layers. This function may only be called from softirq context and interrupts should be enabled.
Return values (usually ignored):
NET_RX_SUCCESS: no congestion
NET_RX_DROP: packet was dropped
- void
netif_receive_skb_list(struct list_head * head)¶ process many receive buffers from network
Parameters
structlist_head*head- list of skbs to process.
Since the return value of netif_receive_skb() is normally ignored, and wouldn't be meaningful for a list, this function returns void. This function may only be called from softirq context and interrupts should be enabled.
- void
__napi_schedule(struct napi_struct * n)¶ schedule for receive
Parameters
structnapi_struct*n- entry to schedule
Description
The entry's receive function will be scheduled to run. Consider using __napi_schedule_irqoff() if hard irqs are masked.
- bool
napi_schedule_prep(struct napi_struct * n)¶ check if napi can be scheduled
Parameters
structnapi_struct*n- napi context
Description
Test if NAPI routine is already running, and if not mark it as running. This is used as a condition variable to ensure that only one NAPI poll instance runs. We also make sure there is no pending NAPI disable.
- void
__napi_schedule_irqoff(struct napi_struct * n)¶ schedule for receive
Parameters
structnapi_struct*n- entry to schedule
Description
Variant of __napi_schedule() assuming hard irqs are masked
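For example, a non-threaded hard interrupt handler (where hard IRQs are masked) might combine napi_schedule_prep() with __napi_schedule_irqoff(); struct foo_priv and foo_mask_rx_irq() are hypothetical:

#include <linux/interrupt.h>
#include <linux/netdevice.h>

struct foo_priv {
        struct napi_struct napi;
        /* ... device registers, rings, etc. ... */
};

static void foo_mask_rx_irq(struct foo_priv *priv);  /* hypothetical */

static irqreturn_t foo_interrupt(int irq, void *dev_id)
{
        struct foo_priv *priv = dev_id;

        if (napi_schedule_prep(&priv->napi)) {
                /* stop further RX interrupts until the poll routine is done */
                foo_mask_rx_irq(priv);
                __napi_schedule_irqoff(&priv->napi);
        }
        return IRQ_HANDLED;
}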
- bool
netdev_has_upper_dev(structnet_device * dev, structnet_device * upper_dev)¶ Check if device is linked to an upper device
Parameters
structnet_device*dev- device
structnet_device*upper_dev- upper device to check
Description
Find out if a device is linked to specified upper device and return true in case it is. Note that this checks only immediate upper device, not through a complete stack of devices. The caller must hold the RTNL lock.
- bool
netdev_has_upper_dev_all_rcu(structnet_device * dev, structnet_device * upper_dev)¶ Check if device is linked to an upper device
Parameters
structnet_device*dev- device
structnet_device*upper_dev- upper device to check
Description
Find out if a device is linked to specified upper device and return true in case it is. Note that this checks the entire upper device chain. The caller must hold the RCU lock.
- bool
netdev_has_any_upper_dev(structnet_device * dev)¶ Check if device is linked to some device
Parameters
structnet_device*dev- device
Description
Find out if a device is linked to an upper device and return true in case it is. The caller must hold the RTNL lock.
- structnet_device *
netdev_master_upper_dev_get(structnet_device * dev)¶ Get master upper device
Parameters
structnet_device*dev- device
Description
Find a master upper device and return pointer to it or NULL in case it's not there. The caller must hold the RTNL lock.
- structnet_device *
netdev_upper_get_next_dev_rcu(structnet_device * dev, struct list_head ** iter)¶ Get the next dev from upper list
Parameters
structnet_device*dev- device
structlist_head**iter- list_head ** of the current position
Description
Gets the next device from the dev's upper list, starting from iter position. The caller must hold RCU read lock.
- void *
netdev_lower_get_next_private(structnet_device * dev, struct list_head ** iter)¶ Get the next ->private from the lower neighbour list
Parameters
structnet_device*dev- device
structlist_head**iter- list_head ** of the current position
Description
Gets the next netdev_adjacent->private from the dev's lower neighbour list, starting from iter position. The caller must either hold the RTNL lock or its own locking that guarantees that the neighbour lower list will remain unchanged.
- void *
netdev_lower_get_next_private_rcu(structnet_device * dev, struct list_head ** iter)¶ Get the next ->private from the lower neighbour list, RCU variant
Parameters
structnet_device*dev- device
structlist_head**iter- list_head ** of the current position
Description
Gets the next netdev_adjacent->private from the dev's lower neighbour list, starting from iter position. The caller must hold RCU read lock.
- void *
netdev_lower_get_next(structnet_device * dev, struct list_head ** iter)¶ Get the next device from the lower neighbour list
Parameters
structnet_device*dev- device
structlist_head**iter- list_head ** of the current position
Description
Gets the next netdev_adjacent from the dev's lower neighbour list, starting from iter position. The caller must hold RTNL lock or its own locking that guarantees that the neighbour lower list will remain unchanged.
- void *
netdev_lower_get_first_private_rcu(structnet_device * dev)¶ Get the first ->private from the lower neighbour list, RCU variant
Parameters
structnet_device*dev- device
Description
Gets the first netdev_adjacent->private from the dev's lower neighbour list. The caller must hold RCU read lock.
- structnet_device *
netdev_master_upper_dev_get_rcu(structnet_device * dev)¶ Get master upper device
Parameters
structnet_device*dev- device
Description
Find a master upper device and return pointer to it or NULL in case it's not there. The caller must hold the RCU read lock.
- int
netdev_upper_dev_link(structnet_device * dev, structnet_device * upper_dev, struct netlink_ext_ack * extack)¶ Add a link to the upper device
Parameters
structnet_device*dev- device
structnet_device*upper_dev- new upper device
structnetlink_ext_ack*extack- netlink extended ack
Description
Adds a link to device which is upper to this one. The caller must hold the RTNL lock. On a failure a negative errno code is returned. On success the reference counts are adjusted and the function returns zero.
- int
netdev_master_upper_dev_link(structnet_device * dev, structnet_device * upper_dev, void * upper_priv, void * upper_info, struct netlink_ext_ack * extack)¶ Add a master link to the upper device
Parameters
structnet_device*dev- device
structnet_device*upper_dev- new upper device
void*upper_priv- upper device private
void*upper_info- upper info to be passed down via notifier
structnetlink_ext_ack*extack- netlink extended ack
Description
Adds a link to device which is upper to this one. In this case, only one master upper device can be linked, although other non-master devices might be linked as well. The caller must hold the RTNL lock. On a failure a negative errno code is returned. On success the reference counts are adjusted and the function returns zero.
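A hedged sketch of an enslave/release pair in an aggregating driver (the foo_* names are hypothetical); both calls require the RTNL lock:

#include <linux/netdevice.h>
#include <linux/rtnetlink.h>

static int foo_enslave(struct net_device *master, struct net_device *slave,
                       struct netlink_ext_ack *extack)
{
        int err;

        ASSERT_RTNL();

        /* slave is the lower device; master becomes its (only) master upper */
        err = netdev_master_upper_dev_link(slave, master, NULL, NULL, extack);
        if (err)
                return err;

        /* ... driver-specific setup (hardware programming, MTU, etc.) ... */
        return 0;
}

static void foo_release(struct net_device *master, struct net_device *slave)
{
        ASSERT_RTNL();
        netdev_upper_dev_unlink(slave, master);
}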
- void
netdev_upper_dev_unlink(structnet_device * dev, structnet_device * upper_dev)¶ Removes a link to upper device
Parameters
structnet_device*dev- device
structnet_device*upper_dev- new upper device
Description
Removes a link to device which is upper to this one. The caller must hold the RTNL lock.
- void
netdev_bonding_info_change(structnet_device * dev, struct netdev_bonding_info * bonding_info)¶ Dispatch event about slave change
Parameters
structnet_device*dev- device
structnetdev_bonding_info*bonding_info- info to dispatch
Description
Send NETDEV_BONDING_INFO to netdev notifiers with info. The caller must hold the RTNL lock.
- structnet_device *
netdev_get_xmit_slave(structnet_device * dev, structsk_buff * skb, bool all_slaves)¶ Get the xmit slave of master device
Parameters
structnet_device*dev- device
structsk_buff*skb- The packet
boolall_slaves- assume all the slaves are active
Description
The reference counters are not incremented so the caller must be careful with locks. The caller must hold RCU lock. NULL is returned if no slave is found.
- void
netdev_lower_state_changed(structnet_device * lower_dev, void * lower_state_info)¶ Dispatch event about lower device state change
Parameters
structnet_device*lower_dev- device
void*lower_state_info- state to dispatch
Description
Send NETDEV_CHANGELOWERSTATE to netdev notifiers with info. The caller must hold the RTNL lock.
- int
dev_set_promiscuity(structnet_device * dev, int inc)¶ update promiscuity count on a device
Parameters
structnet_device*dev- device
intinc- modifier
Add or remove promiscuity from a device. While the count in the device remains above zero the interface remains promiscuous. Once it hits zero the device reverts back to normal filtering operation. A negative inc value is used to drop promiscuity on the device. Return 0 if successful or a negative errno code on error.
- int
dev_set_allmulti(structnet_device * dev, int inc)¶ update allmulti count on a device
Parameters
structnet_device*dev- device
intinc- modifier
Add or remove reception of all multicast frames to a device. While the count in the device remains above zero the interface remains listening to all interfaces. Once it hits zero the device reverts back to normal filtering operation. A negative inc value is used to drop the counter when releasing a resource needing all multicasts. Return 0 if successful or a negative errno code on error.
- unsigned int
dev_get_flags(const structnet_device * dev)¶ get flags reported to userspace
Parameters
conststructnet_device*dev- device
Get the combination of flag bits exported through APIs to userspace.
- int
dev_change_flags(structnet_device * dev, unsigned int flags, struct netlink_ext_ack * extack)¶ change device settings
Parameters
structnet_device*dev- device
unsignedintflags- device state flags
structnetlink_ext_ack*extack- netlink extended ack
Change settings on a device based on state flags. The flags are in the userspace exported format.
- void
dev_set_group(structnet_device * dev, int new_group)¶ Change group this device belongs to
Parameters
structnet_device*dev- device
intnew_group- group this device should belong to
- int
dev_pre_changeaddr_notify(structnet_device * dev, const char * addr, struct netlink_ext_ack * extack)¶ Call NETDEV_PRE_CHANGEADDR.
Parameters
structnet_device*dev- device
constchar*addr- new address
structnetlink_ext_ack*extack- netlink extended ack
- int
dev_set_mac_address(structnet_device * dev, struct sockaddr * sa, struct netlink_ext_ack * extack)¶ Change Media Access Control Address
Parameters
structnet_device*dev- device
structsockaddr*sa- new address
structnetlink_ext_ack*extack- netlink extended ack
Change the hardware (MAC) address of the device
- int
dev_change_carrier(structnet_device * dev, bool new_carrier)¶ Change device carrier
Parameters
structnet_device*dev- device
boolnew_carrier- new value
Change device carrier
- int
dev_get_phys_port_id(structnet_device * dev, struct netdev_phys_item_id * ppid)¶ Get device physical port ID
Parameters
structnet_device*dev- device
structnetdev_phys_item_id*ppid- port ID
Get device physical port ID
- int
dev_get_phys_port_name(structnet_device * dev, char * name, size_t len)¶ Get device physical port name
Parameters
structnet_device*dev- device
char*name- port name
size_tlen- limit of bytes to copy to name
Get device physical port name
- int
dev_get_port_parent_id(structnet_device * dev, struct netdev_phys_item_id * ppid, bool recurse)¶ Get the device’s port parent identifier
Parameters
structnet_device*dev- network device
structnetdev_phys_item_id*ppid- pointer to a storage for the port’s parent identifier
boolrecurse- allow/disallow recursion to lower devices
Get the device's port parent identifier
- bool
netdev_port_same_parent_id(structnet_device * a, structnet_device * b)¶ Indicate if two network devices have the same port parent identifier
Parameters
structnet_device*a- first network device
structnet_device*b- second network device
- int
dev_change_proto_down(structnet_device * dev, bool proto_down)¶ update protocol port state information
Parameters
structnet_device*dev- device
boolproto_down- new value
This info can be used by switch drivers to set the phys state of the port.
- int
dev_change_proto_down_generic(structnet_device * dev, bool proto_down)¶ generic implementation for ndo_change_proto_down that sets carrier according to proto_down.
Parameters
structnet_device*dev- device
boolproto_down- new value
- void
dev_change_proto_down_reason(structnet_device * dev, unsigned long mask, u32 value)¶ proto down reason
Parameters
structnet_device*dev- device
unsignedlongmask- proto down mask
u32value- proto down value
- void
netdev_update_features(structnet_device * dev)¶ recalculate device features
Parameters
structnet_device*dev- the device to check
Recalculate dev->features set and send notifications if it has changed. Should be called after driver or hardware dependent conditions might have changed that influence the features.
- void
netdev_change_features(structnet_device * dev)¶ recalculate device features
Parameters
structnet_device*dev- the device to check
Recalculate dev->features set and send notifications even if they have not changed. Should be called instead of netdev_update_features() if also dev->vlan_features might have changed to allow the changes to be propagated to stacked VLAN devices.
- void
netif_stacked_transfer_operstate(const structnet_device * rootdev, structnet_device * dev)¶ transfer operstate
Parameters
conststructnet_device*rootdev- the root or lower level device to transfer state from
structnet_device*dev- the device to transfer operstate to
Transfer operational state from root to device. This is normally called when a stacking relationship exists between the root device and the device (a leaf device).
- int
register_netdevice(structnet_device * dev)¶ register a network device
Parameters
structnet_device*dev- device to register
Take a completed network device structure and add it to the kernel interfaces. A NETDEV_REGISTER message is sent to the netdev notifier chain. 0 is returned on success. A negative errno code is returned on a failure to set up the device, or if the name is a duplicate.
Callers must hold the rtnl semaphore. You may want register_netdev() instead of this.
BUGS: The locking appears insufficient to guarantee two parallel registers will not get the same name.
- int
init_dummy_netdev(structnet_device * dev)¶ init a dummy network device for NAPI
Parameters
structnet_device*dev- device to init
This takes a network device structure and initializes the minimum amount of fields so it can be used to schedule NAPI polls without registering a full blown interface. This is to be used by drivers that need to tie several hardware interfaces to a single NAPI poll scheduler due to HW limitations.
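A sketch of that use case, assuming a driver that funnels several hardware queues into one NAPI context hosted on an embedded, never-registered dummy netdev (the foo_* names, the embedded layout and foo_poll() are assumptions):

#include <linux/netdevice.h>

struct foo_hw {
        struct net_device napi_dev;   /* dummy device, never registered */
        struct napi_struct napi;
};

static int foo_poll(struct napi_struct *napi, int budget);  /* defined elsewhere */

static void foo_napi_setup(struct foo_hw *hw)
{
        init_dummy_netdev(&hw->napi_dev);
        netif_napi_add(&hw->napi_dev, &hw->napi, foo_poll, NAPI_POLL_WEIGHT);
        napi_enable(&hw->napi);
}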
- int
register_netdev(structnet_device * dev)¶ register a network device
Parameters
structnet_device*dev- device to register
Take a completed network device structure and add it to the kernel interfaces. A NETDEV_REGISTER message is sent to the netdev notifier chain. 0 is returned on success. A negative errno code is returned on a failure to set up the device, or if the name is a duplicate.
This is a wrapper around register_netdevice that takes the rtnl semaphore and expands the device name if you passed a format string to alloc_netdev.
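A hedged probe/remove sketch tying register_netdev() to alloc_etherdev() and free_netdev(); the foo_* structures and ndo callbacks are assumptions and their bodies are omitted:

#include <linux/err.h>
#include <linux/etherdevice.h>
#include <linux/netdevice.h>

struct foo_hw;                          /* hypothetical bus/hardware handle */

struct foo_priv {
        struct foo_hw *hw;
};

static int foo_open(struct net_device *dev);                 /* not shown */
static int foo_stop(struct net_device *dev);                 /* not shown */
static netdev_tx_t foo_start_xmit(struct sk_buff *skb,
                                  struct net_device *dev);   /* not shown */

static const struct net_device_ops foo_netdev_ops = {
        .ndo_open       = foo_open,
        .ndo_stop       = foo_stop,
        .ndo_start_xmit = foo_start_xmit,
};

static struct net_device *foo_probe(struct foo_hw *hw)
{
        struct net_device *dev;
        struct foo_priv *priv;
        int err;

        dev = alloc_etherdev(sizeof(*priv));
        if (!dev)
                return ERR_PTR(-ENOMEM);

        priv = netdev_priv(dev);
        priv->hw = hw;
        dev->netdev_ops = &foo_netdev_ops;

        /* takes the rtnl semaphore and expands the "eth%d" format name */
        err = register_netdev(dev);
        if (err) {
                free_netdev(dev);
                return ERR_PTR(err);
        }
        return dev;
}

static void foo_remove(struct net_device *dev)
{
        unregister_netdev(dev);
        free_netdev(dev);
}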
- struct rtnl_link_stats64 *
dev_get_stats(structnet_device * dev, struct rtnl_link_stats64 * storage)¶ get network device statistics
Parameters
structnet_device*dev- device to get statistics from
structrtnl_link_stats64*storage- place to store stats
Get network statistics from device. Return storage. The device driver may provide its own method by setting dev->netdev_ops->get_stats64 or dev->netdev_ops->get_stats; otherwise the internal statistics structure is used.
- structnet_device *
alloc_netdev_mqs(int sizeof_priv, const char * name, unsigned char name_assign_type, void (*setup)(structnet_device *), unsigned int txqs, unsigned int rxqs)¶ allocate network device
Parameters
intsizeof_priv- size of private data to allocate space for
constchar*name- device name format string
unsignedcharname_assign_type- origin of device name
void(*)(structnet_device*)setup- callback to initialize device
unsignedinttxqs- the number of TX subqueues to allocate
unsignedintrxqs- the number of RX subqueues to allocate
Description
Allocates a struct net_device with private data area for driver use and performs basic initialization. Also allocates subqueue structs for each queue on the device.
- void
free_netdev(structnet_device * dev)¶ free network device
Parameters
structnet_device*dev- device
Description
This function does the last stage of destroying an allocated device interface. The reference to the device object is released. If this is the last reference then it will be freed. Must be called in process context.
- void
synchronize_net(void)¶ Synchronize with packet receive processing
Parameters
void- no arguments
Description
Wait for packets currently being received to be done. Does not block later packets from starting.
- void
unregister_netdevice_queue(structnet_device * dev, struct list_head * head)¶ remove device from the kernel
Parameters
structnet_device*dev- device
structlist_head*head- list
This function shuts down a device interface and removes it from the kernel tables. If head is not NULL, the device is queued to be unregistered later.
Callers must hold the rtnl semaphore. You may want unregister_netdev() instead of this.
- void
unregister_netdevice_many(struct list_head * head)¶ unregister many devices
Parameters
structlist_head*head- list of devices
Note
As most callers use a stack allocated list_head, we force a list_del() to make sure the stack won't be corrupted later.
- void
unregister_netdev(structnet_device * dev)¶ remove device from the kernel
Parameters
structnet_device*dev- device
This function shuts down a device interface and removes it from the kernel tables.
This is just a wrapper for unregister_netdevice that takes the rtnl semaphore. In general you want to use this and not unregister_netdevice.
- int
dev_change_net_namespace(structnet_device * dev, struct net * net, const char * pat)¶ move device to a different network namespace
Parameters
structnet_device*dev- device
structnet*net- network namespace
constchar*pat- If not NULL, name pattern to try if the current device name is already taken in the destination network namespace.
This function shuts down a device interface and moves it to a new network namespace. On success 0 is returned, on a failure a negative errno code is returned.
Callers must hold the rtnl semaphore.
- netdev_features_t
netdev_increment_features(netdev_features_t all, netdev_features_t one, netdev_features_t mask)¶ increment feature set by one
Parameters
netdev_features_tall- current feature set
netdev_features_tone- new feature set
netdev_features_tmask- mask feature set
Computes a new feature set after adding a device with feature set one to the master device with current feature set all. Will not enable anything that is off in mask. Returns the new feature set.
- int
eth_header(structsk_buff * skb, structnet_device * dev, unsigned short type, const void * daddr, const void * saddr, unsigned int len)¶ create the Ethernet header
Parameters
structsk_buff*skb- buffer to alter
structnet_device*dev- source device
unsignedshorttype- Ethernet type field
constvoid*daddr- destination address (NULL leave destination address)
constvoid*saddr- source address (NULL use device source address)
unsignedintlen- packet length (<= skb->len)
Description
Set the protocol type. For a packet of type ETH_P_802_3/2 we put the length in here instead.
- u32
eth_get_headlen(const structnet_device * dev, void * data, unsigned int len)¶ determine the length of header for an ethernet frame
Parameters
conststructnet_device*dev- pointer to network device
void*data- pointer to start of frame
unsignedintlen- total length of frame
Description
Make a best effort attempt to pull the length for all of the headers for a given frame in a linear buffer.
- __be16
eth_type_trans(structsk_buff * skb, structnet_device * dev)¶ determine the packet’s protocol ID.
Parameters
structsk_buff*skb- received socket data
structnet_device*dev- receiving network device
Description
The rule here is that we assume 802.3 if the type field is short enough to be a length. This is normal practice and works for any 'now in use' protocol.
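As a hedged RX-completion sketch (the foo_* names and the copy-based receive are assumptions; a real driver would usually hand pre-mapped buffers to the skb instead of copying):

#include <linux/etherdevice.h>
#include <linux/netdevice.h>
#include <linux/skbuff.h>

struct foo_priv {
        struct net_device *dev;
        struct napi_struct napi;
};

static void foo_receive_frame(struct foo_priv *priv,
                              const void *buf, unsigned int len)
{
        struct sk_buff *skb = napi_alloc_skb(&priv->napi, len);

        if (!skb) {
                priv->dev->stats.rx_dropped++;
                return;
        }

        skb_put_data(skb, buf, len);
        /* sets skb->dev, pulls the Ethernet header and returns the protocol */
        skb->protocol = eth_type_trans(skb, priv->dev);
        napi_gro_receive(&priv->napi, skb);
}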
- int
eth_header_parse(const structsk_buff * skb, unsigned char * haddr)¶ extract hardware address from packet
Parameters
conststructsk_buff*skb- packet to extract header from
unsignedchar*haddr- destination buffer
- int
eth_header_cache(const struct neighbour * neigh, struct hh_cache * hh, __be16 type)¶ fill cache entry from neighbour
Parameters
conststructneighbour*neigh- source neighbour
structhh_cache*hh- destination cache entry
__be16type- Ethernet type field
Description
Create an Ethernet header template from the neighbour.
- void
eth_header_cache_update(struct hh_cache * hh, const structnet_device * dev, const unsigned char * haddr)¶ update cache entry
Parameters
structhh_cache*hh- destination cache entry
conststructnet_device*dev- network device
constunsignedchar*haddr- new hardware address
Description
Called by Address Resolution module to notify changes in address.
Parameters
conststructsk_buff*skb- packet to extract protocol from
- int
eth_prepare_mac_addr_change(structnet_device * dev, void * p)¶ prepare for mac change
Parameters
structnet_device*dev- network device
void*p- socket address
- void
eth_commit_mac_addr_change(structnet_device * dev, void * p)¶ commit mac change
Parameters
structnet_device*dev- network device
void*p- socket address
- int
eth_mac_addr(structnet_device * dev, void * p)¶ set new Ethernet hardware address
Parameters
structnet_device*dev- network device
void*p- socket address
Description
Change hardware address of device.
This doesn't change hardware matching, so needs to be overridden for most real devices.
- void
ether_setup(structnet_device * dev)¶ setup Ethernet network device
Parameters
structnet_device*dev- network device
Description
Fill in the fields of the device structure with Ethernet-generic values.
- structnet_device *
alloc_etherdev_mqs(int sizeof_priv, unsigned int txqs, unsigned int rxqs)¶ Allocates and sets up an Ethernet device
Parameters
intsizeof_priv- Size of additional driver-private structure to be allocatedfor this Ethernet device
unsignedinttxqs- The number of TX queues this device has.
unsignedintrxqs- The number of RX queues this device has.
Description
Fill in the fields of the device structure with Ethernet-generic values. Basically does everything except registering the device.
Constructs a new net device, complete with a private data area of size (sizeof_priv). A 32-byte (not bit) alignment is enforced for this private data area.
Parameters
structdevice*dev- Device with which the mac-address cell is associated.
void*addrbuf- Buffer to which the MAC address will be copied on success.
Description
Returns 0 on success or a negative error number on failure.
- void
netif_carrier_on(structnet_device * dev)¶ set carrier
Parameters
structnet_device*dev- network device
Description
Device has detected acquisition of carrier.
- void
netif_carrier_off(structnet_device * dev)¶ clear carrier
Parameters
structnet_device*dev- network device
Description
Device has detected loss of carrier.
- bool
is_link_local_ether_addr(const u8 * addr)¶ Determine if given Ethernet address is link-local
Parameters
constu8*addr- Pointer to a six-byte array containing the Ethernet address
Description
Return true if address is link local reserved addr (01:80:c2:00:00:0X) per IEEE 802.1Q 8.6.3 Frame filtering.
Please note: addr must be aligned to u16.
- bool
is_zero_ether_addr(const u8 * addr)¶ Determine if given Ethernet address is all zeros.
Parameters
constu8*addr- Pointer to a six-byte array containing the Ethernet address
Description
Return true if the address is all zeroes.
Please note: addr must be aligned to u16.
- bool
is_multicast_ether_addr(const u8 * addr)¶ Determine if the Ethernet address is a multicast.
Parameters
constu8*addr- Pointer to a six-byte array containing the Ethernet address
Description
Return true if the address is a multicast address. By definition the broadcast address is also a multicast address.
- bool
is_local_ether_addr(const u8 * addr)¶ Determine if the Ethernet address is locally-assigned one (IEEE 802).
Parameters
constu8*addr- Pointer to a six-byte array containing the Ethernet address
Description
Return true if the address is a local address.
- bool
is_broadcast_ether_addr(const u8 * addr)¶ Determine if the Ethernet address is broadcast
Parameters
constu8*addr- Pointer to a six-byte array containing the Ethernet address
Description
Return true if the address is the broadcast address.
Please note: addr must be aligned to u16.
- bool
is_unicast_ether_addr(const u8 * addr)¶ Determine if the Ethernet address is unicast
Parameters
constu8*addr- Pointer to a six-byte array containing the Ethernet address
Description
Return true if the address is a unicast address.
- bool
is_valid_ether_addr(const u8 * addr)¶ Determine if the given Ethernet address is valid
Parameters
constu8*addr- Pointer to a six-byte array containing the Ethernet address
Description
Check that the Ethernet address (MAC) is not 00:00:00:00:00:00, is not a multicast address, and is not FF:FF:FF:FF:FF:FF.
Return true if the address is valid.
Please note: addr must be aligned to u16.
- bool
eth_proto_is_802_3(__be16 proto)¶ Determine if a given Ethertype/length is a protocol
Parameters
__be16proto- Ethertype/length value to be tested
Description
Check that the value from the Ethertype/length field is a valid Ethertype.
Return true if the value is an 802.3 supported Ethertype.
- void
eth_random_addr(u8 * addr)¶ Generate software assigned random Ethernet address
Parameters
u8*addr- Pointer to a six-byte array containing the Ethernet address
Description
Generate a random Ethernet address (MAC) that is not multicast and has the local assigned bit set.
- void
eth_broadcast_addr(u8 * addr)¶ Assign broadcast address
Parameters
u8*addr- Pointer to a six-byte array containing the Ethernet address
Description
Assign the broadcast address to the given address array.
- void
eth_zero_addr(u8 * addr)¶ Assign zero address
Parameters
u8*addr- Pointer to a six-byte array containing the Ethernet address
Description
Assign the zero address to the given address array.
- void
eth_hw_addr_random(structnet_device * dev)¶ Generate software assigned random Ethernet and set device flag
Parameters
structnet_device*dev- pointer to net_device structure
Description
Generate a random Ethernet address (MAC) to be used by a net device and set addr_assign_type so the state can be read by sysfs and be used by userspace.
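A sketch of the usual pattern of validating an address read from EEPROM/firmware and falling back to a random one (foo_read_mac() is a hypothetical helper):

#include <linux/etherdevice.h>

static void foo_read_mac(struct net_device *dev, u8 *addr);  /* hypothetical */

static void foo_init_mac(struct net_device *dev)
{
        u8 addr[ETH_ALEN];

        foo_read_mac(dev, addr);

        if (is_valid_ether_addr(addr))
                ether_addr_copy(dev->dev_addr, addr);
        else
                eth_hw_addr_random(dev);  /* also updates addr_assign_type */
}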
- u32
eth_hw_addr_crc(struct netdev_hw_addr * ha)¶ Calculate CRC from netdev_hw_addr
Parameters
structnetdev_hw_addr*ha- pointer to hardware address
Description
Calculate CRC from a hardware address as basis for filter hashes.
- void
ether_addr_copy(u8 * dst, const u8 * src)¶ Copy an Ethernet address
Parameters
u8*dst- Pointer to a six-byte array Ethernet address destination
constu8*src- Pointer to a six-byte array Ethernet address source
Description
Please note: dst & src must both be aligned to u16.
- void
eth_hw_addr_inherit(structnet_device * dst, structnet_device * src)¶ Copy dev_addr from another net_device
Parameters
structnet_device*dst- pointer to net_device to copy dev_addr to
structnet_device*src- pointer to net_device to copy dev_addr from
Description
Copy the Ethernet address from one net_device to another along with the address attributes (addr_assign_type).
- bool
ether_addr_equal(const u8 * addr1, const u8 * addr2)¶ Compare two Ethernet addresses
Parameters
constu8*addr1- Pointer to a six-byte array containing the Ethernet address
constu8*addr2- Pointer other six-byte array containing the Ethernet address
Description
Compare two Ethernet addresses, returns true if equal
Please note: addr1 & addr2 must both be aligned to u16.
- bool
ether_addr_equal_64bits(const u8 addr1, const u8 addr2)¶ Compare two Ethernet addresses
Parameters
constu8addr1- Pointer to an array of 8 bytes
constu8addr2- Pointer to an other array of 8 bytes
Description
Compare two Ethernet addresses, returns true if equal, false otherwise.
The function doesn't need any conditional branches and possibly uses word memory accesses on CPU allowing cheap unaligned memory reads. arrays = { byte1, byte2, byte3, byte4, byte5, byte6, pad1, pad2 }
Please note that alignment of addr1 & addr2 are only guaranteed to be 16 bits.
- bool
ether_addr_equal_unaligned(const u8 * addr1, const u8 * addr2)¶ Compare two not u16 aligned Ethernet addresses
Parameters
constu8*addr1- Pointer to a six-byte array containing the Ethernet address
constu8*addr2- Pointer other six-byte array containing the Ethernet address
Description
Compare two Ethernet addresses, returns true if equal
Please note: Use only when any Ethernet address may not be u16 aligned.
- bool
ether_addr_equal_masked(const u8 * addr1, const u8 * addr2, const u8 * mask)¶ Compare two Ethernet addresses with a mask
Parameters
constu8*addr1- Pointer to a six-byte array containing the 1st Ethernet address
constu8*addr2- Pointer to a six-byte array containing the 2nd Ethernet address
constu8*mask- Pointer to a six-byte array containing the Ethernet address bitmask
Description
Compare two Ethernet addresses with a mask, returns true if for every bit set in the bitmask the equivalent bits in the ethernet addresses are equal. Using a mask with all bits set is a slower ether_addr_equal.
- u64
ether_addr_to_u64(const u8 * addr)¶ Convert an Ethernet address into a u64 value.
Parameters
constu8*addr- Pointer to a six-byte array containing the Ethernet address
Description
Return a u64 value of the address
- void
u64_to_ether_addr(u64 u, u8 * addr)¶ Convert a u64 to an Ethernet address.
Parameters
u64u- u64 to convert to an Ethernet MAC address
u8*addr- Pointer to a six-byte array to contain the Ethernet address
- void
eth_addr_dec(u8 * addr)¶ Decrement the given MAC address
Parameters
u8*addr- Pointer to a six-byte array containing Ethernet address to decrement
- void
eth_addr_inc(u8 * addr)¶ Increment the given MAC address.
Parameters
u8*addr- Pointer to a six-byte array containing Ethernet address to increment.
- bool
is_etherdev_addr(const structnet_device * dev, const u8 addr)¶ Tell if given Ethernet address belongs to the device.
Parameters
conststructnet_device*dev- Pointer to a device structure
constu8addr- Pointer to a six-byte array containing the Ethernet address
Description
Compare passed address with all addresses of the device. Return true if the address is one of the device addresses.
Note that this function calls ether_addr_equal_64bits() so take care of the right padding.
- unsigned long
compare_ether_header(const void * a, const void * b)¶ Compare two Ethernet headers
Parameters
constvoid*a- Pointer to Ethernet header
constvoid*b- Pointer to Ethernet header
Description
Compare two Ethernet headers, returns 0 if equal. This assumes that the network header (i.e., IP header) is 4-byte aligned OR the platform can handle unaligned access. This is the case for all packets coming into netif_receive_skb or similar entry points.
Parameters
structsk_buff*skb- Buffer to pad
Description
An Ethernet frame should have a minimum size of 60 bytes. This function takes short frames and pads them with zeros up to the 60 byte limit.
- void
napi_schedule(struct napi_struct * n)¶ schedule NAPI poll
Parameters
structnapi_struct*n- NAPI context
Description
Schedule NAPI poll routine to be called if it is not already running.
- void
napi_schedule_irqoff(struct napi_struct * n)¶ schedule NAPI poll
Parameters
structnapi_struct*n- NAPI context
Description
Variant of napi_schedule(), assuming hard irqs are masked.
- bool
napi_complete(struct napi_struct * n)¶ NAPI processing complete
Parameters
structnapi_struct*n- NAPI context
Description
Mark NAPI processing as complete. Consider using napi_complete_done() instead. Return false if device should avoid rearming interrupts.
- bool
napi_hash_del(struct napi_struct * napi)¶ remove a NAPI from global table
Parameters
structnapi_struct*napi- NAPI context
Description
Warning: caller must observe RCU grace period before freeing memory containing napi, if this function returns true.
Note
core networking stack automatically calls it from netif_napi_del(). Drivers might want to call this helper to combine all the needed RCU grace periods into a single one.
- void
napi_disable(struct napi_struct * n)¶ prevent NAPI from scheduling
Parameters
structnapi_struct*n- NAPI context
Description
Stop NAPI from being scheduled on this context. Waits till any outstanding processing completes.
- void
napi_enable(struct napi_struct * n)¶ enable NAPI scheduling
Parameters
structnapi_struct*n- NAPI context
Description
Resume NAPI from being scheduled on this context. Must be paired with napi_disable.
- void
napi_synchronize(const struct napi_struct * n)¶ wait until NAPI is not running
Parameters
conststructnapi_struct*n- NAPI context
Description
Wait until NAPI is done being scheduled on this context. Waits till any outstanding processing completes but does not disable future activations.
- bool
napi_if_scheduled_mark_missed(struct napi_struct * n)¶ if napi is running, set the NAPIF_STATE_MISSED
Parameters
structnapi_struct*n- NAPI context
Description
If napi is running, set the NAPIF_STATE_MISSED, and return true if NAPI is scheduled.
- enum
netdev_priv_flags¶ struct net_device priv_flags
Constants
IFF_802_1Q_VLAN- 802.1Q VLAN device
IFF_EBRIDGE- Ethernet bridging device
IFF_BONDING- bonding master or slave
IFF_ISATAP- ISATAP interface (RFC4214)
IFF_WAN_HDLC- WAN HDLC device
IFF_XMIT_DST_RELEASE- dev_hard_start_xmit() is allowed to release skb->dst
IFF_DONT_BRIDGE- disallow bridging this ether dev
IFF_DISABLE_NETPOLL- disable netpoll at run-time
IFF_MACVLAN_PORT- device used as macvlan port
IFF_BRIDGE_PORT- device used as bridge port
IFF_OVS_DATAPATH- device used as Open vSwitch datapath port
IFF_TX_SKB_SHARING- The interface supports sharing skbs on transmit
IFF_UNICAST_FLT- Supports unicast filtering
IFF_TEAM_PORT- device used as team port
IFF_SUPP_NOFCS- device supports sending custom FCS
IFF_LIVE_ADDR_CHANGE- device supports hardware address change when it's running
IFF_MACVLAN- Macvlan device
IFF_XMIT_DST_RELEASE_PERM- IFF_XMIT_DST_RELEASE not taking into account underlying stacked devices
IFF_L3MDEV_MASTER- device is an L3 master device
IFF_NO_QUEUE- device can run without qdisc attached
IFF_OPENVSWITCH- device is an Open vSwitch master
IFF_L3MDEV_SLAVE- device is enslaved to an L3 master device
IFF_TEAM- device is a team device
IFF_RXFH_CONFIGURED- device has had Rx Flow indirection table configured
IFF_PHONY_HEADROOM- the headroom value is controlled by an external entity (i.e. the master device for bridged veth)
IFF_MACSEC- device is a MACsec device
IFF_NO_RX_HANDLER- device doesn’t support the rx_handler hook
IFF_FAILOVER- device is a failover master device
IFF_FAILOVER_SLAVE- device is lower dev of a failover master device
IFF_L3MDEV_RX_HANDLER- only invoke the rx handler of L3 master device
IFF_LIVE_RENAME_OK- rename is allowed while device is up and running
Description
These are the struct net_device priv_flags; they are only set internally by drivers and used in the kernel. These flags are invisible to userspace; this means that the order of these flags can change during any kernel release.
You should have a pretty good reason to be extending these flags.
- struct
net_device¶ The DEVICE structure.
Definition
struct net_device { char name[IFNAMSIZ]; struct netdev_name_node *name_node; struct dev_ifalias __rcu *ifalias; unsigned long mem_end; unsigned long mem_start; unsigned long base_addr; int irq; unsigned long state; struct list_head dev_list; struct list_head napi_list; struct list_head unreg_list; struct list_head close_list; struct list_head ptype_all; struct list_head ptype_specific; struct { struct list_head upper; struct list_head lower; } adj_list; netdev_features_t features; netdev_features_t hw_features; netdev_features_t wanted_features; netdev_features_t vlan_features; netdev_features_t hw_enc_features; netdev_features_t mpls_features; netdev_features_t gso_partial_features; int ifindex; int group; struct net_device_stats stats; atomic_long_t rx_dropped; atomic_long_t tx_dropped; atomic_long_t rx_nohandler; atomic_t carrier_up_count; atomic_t carrier_down_count;#ifdef CONFIG_WIRELESS_EXT; const struct iw_handler_def *wireless_handlers; struct iw_public_data *wireless_data;#endif; const struct net_device_ops *netdev_ops; const struct ethtool_ops *ethtool_ops;#ifdef CONFIG_NET_L3_MASTER_DEV; const struct l3mdev_ops *l3mdev_ops;#endif;#if IS_ENABLED(CONFIG_IPV6); const struct ndisc_ops *ndisc_ops;#endif;#ifdef CONFIG_XFRM_OFFLOAD; const struct xfrmdev_ops *xfrmdev_ops;#endif;#if IS_ENABLED(CONFIG_TLS_DEVICE); const struct tlsdev_ops *tlsdev_ops;#endif; const struct header_ops *header_ops; unsigned int flags; unsigned int priv_flags; unsigned short gflags; unsigned short padded; unsigned char operstate; unsigned char link_mode; unsigned char if_port; unsigned char dma; unsigned int mtu; unsigned int min_mtu; unsigned int max_mtu; unsigned short type; unsigned short hard_header_len; unsigned char min_header_len; unsigned char name_assign_type; unsigned short needed_headroom; unsigned short needed_tailroom; unsigned char perm_addr[MAX_ADDR_LEN]; unsigned char addr_assign_type; unsigned char addr_len; unsigned char upper_level; unsigned char lower_level; unsigned short neigh_priv_len; unsigned short dev_id; unsigned short dev_port; spinlock_t addr_list_lock; struct netdev_hw_addr_list uc; struct netdev_hw_addr_list mc; struct netdev_hw_addr_list dev_addrs;#ifdef CONFIG_SYSFS; struct kset *queues_kset;#endif;#ifdef CONFIG_LOCKDEP; struct list_head unlink_list;#endif; unsigned int promiscuity; unsigned int allmulti; bool uc_promisc;#ifdef CONFIG_LOCKDEP; unsigned char nested_level;#endif;#if IS_ENABLED(CONFIG_VLAN_8021Q); struct vlan_info __rcu *vlan_info;#endif;#if IS_ENABLED(CONFIG_NET_DSA); struct dsa_port *dsa_ptr;#endif;#if IS_ENABLED(CONFIG_TIPC); struct tipc_bearer __rcu *tipc_ptr;#endif;#if IS_ENABLED(CONFIG_IRDA) || IS_ENABLED(CONFIG_ATALK); void *atalk_ptr;#endif; struct in_device __rcu *ip_ptr;#if IS_ENABLED(CONFIG_DECNET); struct dn_dev __rcu *dn_ptr;#endif; struct inet6_dev __rcu *ip6_ptr;#if IS_ENABLED(CONFIG_AX25); void *ax25_ptr;#endif; struct wireless_dev *ieee80211_ptr; struct wpan_dev *ieee802154_ptr;#if IS_ENABLED(CONFIG_MPLS_ROUTING); struct mpls_dev __rcu *mpls_ptr;#endif; unsigned char *dev_addr; struct netdev_rx_queue *_rx; unsigned int num_rx_queues; unsigned int real_num_rx_queues; struct bpf_prog __rcu *xdp_prog; unsigned long gro_flush_timeout; int napi_defer_hard_irqs; rx_handler_func_t __rcu *rx_handler; void __rcu *rx_handler_data;#ifdef CONFIG_NET_CLS_ACT; struct mini_Qdisc __rcu *miniq_ingress;#endif; struct netdev_queue __rcu *ingress_queue;#ifdef CONFIG_NETFILTER_INGRESS; struct nf_hook_entries __rcu *nf_hooks_ingress;#endif; unsigned char 
broadcast[MAX_ADDR_LEN];#ifdef CONFIG_RFS_ACCEL; struct cpu_rmap *rx_cpu_rmap;#endif; struct hlist_node index_hlist; struct netdev_queue *_tx ; unsigned int num_tx_queues; unsigned int real_num_tx_queues; struct Qdisc *qdisc; unsigned int tx_queue_len; spinlock_t tx_global_lock; struct xdp_dev_bulk_queue __percpu *xdp_bulkq;#ifdef CONFIG_XPS; struct xps_dev_maps __rcu *xps_cpus_map; struct xps_dev_maps __rcu *xps_rxqs_map;#endif;#ifdef CONFIG_NET_CLS_ACT; struct mini_Qdisc __rcu *miniq_egress;#endif;#ifdef CONFIG_NET_SCHED; unsigned long qdisc_hash[1 << ((4) - 1)];#endif; struct timer_list watchdog_timer; int watchdog_timeo; u32 proto_down_reason; struct list_head todo_list; int __percpu *pcpu_refcnt; struct list_head link_watch_list; enum { NETREG_UNINITIALIZED=0, NETREG_REGISTERED, NETREG_UNREGISTERING, NETREG_UNREGISTERED, NETREG_RELEASED, NETREG_DUMMY, } reg_state:8; bool dismantle; enum { RTNL_LINK_INITIALIZED, RTNL_LINK_INITIALIZING, } rtnl_link_state:16; bool needs_free_netdev; void (*priv_destructor)(struct net_device *dev);#ifdef CONFIG_NETPOLL; struct netpoll_info __rcu *npinfo;#endif; possible_net_t nd_net; union { void *ml_priv; struct pcpu_lstats __percpu *lstats; struct pcpu_sw_netstats __percpu *tstats; struct pcpu_dstats __percpu *dstats; };#if IS_ENABLED(CONFIG_GARP); struct garp_port __rcu *garp_port;#endif;#if IS_ENABLED(CONFIG_MRP); struct mrp_port __rcu *mrp_port;#endif; struct device dev; const struct attribute_group *sysfs_groups[4]; const struct attribute_group *sysfs_rx_queue_group; const struct rtnl_link_ops *rtnl_link_ops;#define GSO_MAX_SIZE 65536; unsigned int gso_max_size;#define GSO_MAX_SEGS 65535; u16 gso_max_segs;#ifdef CONFIG_DCB; const struct dcbnl_rtnl_ops *dcbnl_ops;#endif; s16 num_tc; struct netdev_tc_txq tc_to_txq[TC_MAX_QUEUE]; u8 prio_tc_map[TC_BITMASK + 1];#if IS_ENABLED(CONFIG_FCOE); unsigned int fcoe_ddp_xid;#endif;#if IS_ENABLED(CONFIG_CGROUP_NET_PRIO); struct netprio_map __rcu *priomap;#endif; struct phy_device *phydev; struct sfp_bus *sfp_bus; struct lock_class_key *qdisc_tx_busylock; struct lock_class_key *qdisc_running_key; bool proto_down; unsigned wol_enabled:1; struct list_head net_notifier_list;#if IS_ENABLED(CONFIG_MACSEC); const struct macsec_ops *macsec_ops;#endif; const struct udp_tunnel_nic_info *udp_tunnel_nic_info; struct udp_tunnel_nic *udp_tunnel_nic; struct bpf_xdp_entity xdp_state[__MAX_XDP_MODE];};Members
name- This is the first field of the “visible” part of this structure (i.e. as seen by users in the “Space.c” file). It is the name of the interface.
name_node- Name hashlist node
ifalias- SNMP alias
mem_end- Shared memory end
mem_start- Shared memory start
base_addr- Device I/O address
irq- Device IRQ number
state- Generic network queuing layer state, see netdev_state_t
dev_list- The global list of network devices
napi_list- List entry used for polling NAPI devices
unreg_list- List entry when we are unregistering the device; see the function unregister_netdev
close_list- List entry used when we are closing the device
ptype_all- Device-specific packet handlers for all protocols
ptype_specific- Device-specific, protocol-specific packet handlers
adj_list- Directly linked devices, like slaves for bonding
features- Currently active device features
hw_features- User-changeable features
wanted_features- User-requested features
vlan_features- Mask of features inheritable by VLAN devices
hw_enc_features- Mask of features inherited by encapsulating devices. This field indicates what encapsulation offloads the hardware is capable of doing, and drivers will need to set them appropriately.
mpls_features- Mask of features inheritable by MPLS
gso_partial_features- value(s) from NETIF_F_GSO*
ifindex- interface index
group- The group the device belongs to
stats- Statistics struct, which was left as a legacy, use rtnl_link_stats64 instead
rx_dropped- Dropped packets by core network, do not use this in drivers
tx_dropped- Dropped packets by core network, do not use this in drivers
rx_nohandler- nohandler dropped packets by core network on inactive devices, do not use this in drivers
carrier_up_count- Number of times the carrier has been up
carrier_down_count- Number of times the carrier has been down
wireless_handlers- List of functions to handle Wireless Extensions, instead of ioctl, see <net/iw_handler.h> for details.
wireless_data- Instance data managed by the core of wireless extensions
netdev_ops- Includes several pointers to callbacks,if one wants to override the ndo_*() functions
ethtool_ops- Management operations
l3mdev_ops- Layer 3 master device operations
ndisc_ops- Includes callbacks for different IPv6 neighbourdiscovery handling. Necessary for e.g. 6LoWPAN.
xfrmdev_ops- Transformation offload operations
tlsdev_ops- Transport Layer Security offload operations
header_ops- Includes callbacks for creating, parsing, caching, etc of Layer 2 headers.
flags- Interface flags (a la BSD)
priv_flags- Like 'flags' but invisible to userspace, see if.h for the definitions
gflags- Global flags ( kept as legacy )
padded- How much padding added by alloc_netdev()
operstate- RFC2863 operstate
link_mode- Mapping policy to operstate
if_port- Selectable AUI, TP, …
dma- DMA channel
mtu- Interface MTU value
min_mtu- Interface Minimum MTU value
max_mtu- Interface Maximum MTU value
type- Interface hardware type
hard_header_len- Maximum hardware header length.
min_header_len- Minimum hardware header length
name_assign_type- network interface name assignment type
needed_headroom- Extra headroom the hardware may need, but not in all cases can this be guaranteed
needed_tailroom- Extra tailroom the hardware may need, but not in all cases can this be guaranteed. Some cases also use LL_MAX_HEADER instead to allocate the skb
perm_addr- Permanent hw address
addr_assign_type- Hw address assignment type
addr_len- Hardware address length
upper_level- Maximum depth level of upper devices.
lower_level- Maximum depth level of lower devices.
neigh_priv_len- Used in neigh_alloc()
dev_id- Used to differentiate devices that sharethe same link layer address
dev_port- Used to differentiate devices that sharethe same function
addr_list_lock- XXX: need comments on this one
uc- unicast mac addresses
mc- multicast mac addresses
dev_addrs- list of device hw addresses
queues_kset- Group of all Kobjects in the Tx and RX queues
unlink_list- As netif_addr_lock() can be called recursively, keep a list of interfaces to be deleted.
FIXME: cleanup struct net_device such that network protocol info moves out.
promiscuity- Number of times the NIC is told to work in promiscuous mode; if it becomes 0 the NIC will exit promiscuous mode
allmulti- Counter, enables or disables allmulticast mode
uc_promisc- Counter that indicates promiscuous mode has been enabled due to the need to listen to additional unicast addresses in a device that does not implement ndo_set_rx_mode()
nested_level- Used as a parameter of spin_lock_nested() of dev->addr_list_lock.
vlan_info- VLAN info
dsa_ptr- dsa specific data
tipc_ptr- TIPC specific data
atalk_ptr- AppleTalk link
ip_ptr- IPv4 specific data
dn_ptr- DECnet specific data
ip6_ptr- IPv6 specific data
ax25_ptr- AX.25 specific data
ieee80211_ptr- IEEE 802.11 specific data, assign before registering
ieee802154_ptr- IEEE 802.15.4 low-rate Wireless Personal Area Networkdevice struct
mpls_ptr- mpls_dev struct pointer
dev_addr- Hw address (before bcast,because most packets are unicast)
_rx- Array of RX queues
num_rx_queues- Number of RX queues allocated at register_netdev() time
real_num_rx_queues- Number of RX queues currently active in device
xdp_prog- XDP sockets filter program pointer
gro_flush_timeout- timeout for GRO layer in NAPI
napi_defer_hard_irqs- If not zero, provides a counter that allows avoiding NIC hard IRQs on busy queues.
rx_handler- handler for received packets
rx_handler_data- XXX: need comments on this one
miniq_ingress- ingress/clsact qdisc specific data foringress processing
ingress_queue- XXX: need comments on this one
nf_hooks_ingress- netfilter hooks executed for ingress packets
broadcast- hw bcast address
rx_cpu_rmap- CPU reverse-mapping for RX completion interrupts, indexed by RX queue number. Assigned by driver. This must only be set if the ndo_rx_flow_steer operation is defined
index_hlist- Device index hash chain
_tx- Array of TX queues
num_tx_queues- Number of TX queues allocated at alloc_netdev_mq() time
real_num_tx_queues- Number of TX queues currently active in device
qdisc- Root qdisc from userspace point of view
tx_queue_len- Max frames per queue allowed
tx_global_lock- XXX: need comments on this one
xdp_bulkq- XDP device bulk queue
xps_cpus_map- all CPUs map for XPS device
xps_rxqs_map- all RXQs map for XPS device
miniq_egress- clsact qdisc specific data foregress processing
qdisc_hash- qdisc hash table
watchdog_timer- List of timers
watchdog_timeo- Represents the timeout that is used bythe watchdog (see dev_watchdog())
proto_down_reason- reason a netdev interface is held down
todo_list- Delayed register/unregister
pcpu_refcnt- Number of references to this device
link_watch_list- XXX: need comments on this one
reg_state- Register/unregister state machine
dismantle- Device is going to be freed
rtnl_link_state- This enum represents the phases of creatinga new link
needs_free_netdev- Should unregister perform free_netdev?
priv_destructor- Called from unregister
npinfo- XXX: need comments on this one
nd_net- Network namespace this network device is inside
{unnamed_union}- anonymous
ml_priv- Mid-layer private
lstats- Loopback statistics
tstats- Tunnel statistics
dstats- Dummy statistics
garp_port- GARP
mrp_port- MRP
dev- Class/net/name entry
sysfs_groups- Space for optional device, statistics and wirelesssysfs groups
sysfs_rx_queue_group- Space for optional per-rx queue attributes
rtnl_link_ops- Rtnl_link_ops
gso_max_size- Maximum size of generic segmentation offload
gso_max_segs- Maximum number of segments that can be passed to theNIC for GSO
dcbnl_ops- Data Center Bridging netlink ops
num_tc- Number of traffic classes in the net device
tc_to_txq- XXX: need comments on this one
prio_tc_map- XXX: need comments on this one
fcoe_ddp_xid- Max exchange id for FCoE LRO by ddp
priomap- XXX: need comments on this one
phydev- Physical device may attach itselffor hardware timestamping
sfp_bus- attached struct sfp_bus structure
qdisc_tx_busylock- lockdep class annotating Qdisc->busylock spinlock
qdisc_running_key- lockdep class annotating Qdisc->running seqcount
proto_down- protocol port state information can be sent to the switch driver and used to set the phys state of the switch port.
wol_enabled- Wake-on-LAN is enabled
net_notifier_list- List of per-net netdev notifier blocks that follow this device when it is moved to another network namespace.
macsec_ops- MACsec offloading ops
udp_tunnel_nic_info- static structure describing the UDP tunneloffload capabilities of the device
udp_tunnel_nic- UDP tunnel offload state
xdp_state- stores info on attached XDP BPF programs
Description
Actually, this whole structure is a big mistake. It mixes I/O data with strictly “high-level” data, and it has to know about almost every data structure used in the INET module.
- void *
netdev_priv(const structnet_device * dev)¶ access network device private data
Parameters
conststructnet_device*dev- network device
Description
Get network device private data
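For instance, the private area requested through alloc_etherdev(sizeof(struct foo_priv)) is reached with netdev_priv(); struct foo_priv is a hypothetical driver structure:

#include <linux/etherdevice.h>
#include <linux/netdevice.h>

struct foo_priv {
        void __iomem *regs;
        struct napi_struct napi;
};

static int foo_open(struct net_device *dev)
{
        struct foo_priv *priv = netdev_priv(dev);

        napi_enable(&priv->napi);
        netif_start_queue(dev);
        return 0;
}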
- void
netif_napi_add(structnet_device * dev, struct napi_struct * napi, int (*poll)(struct napi_struct *, int), int weight)¶ initialize a NAPI context
Parameters
structnet_device*dev- network device
structnapi_struct*napi- NAPI context
int(*)(structnapi_struct*,int)poll- polling function
intweight- default weight
Description
netif_napi_add() must be used to initialize a NAPI context prior to callingany of the other NAPI-related functions.
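A skeleton pairing of netif_napi_add() with a poll routine (foo_clean_rx_ring() and foo_unmask_rx_irq() are hypothetical helpers):

#include <linux/kernel.h>
#include <linux/netdevice.h>

struct foo_priv {
        struct napi_struct napi;
};

static int foo_clean_rx_ring(struct foo_priv *priv, int budget);  /* hypothetical */
static void foo_unmask_rx_irq(struct foo_priv *priv);             /* hypothetical */

static int foo_poll(struct napi_struct *napi, int budget)
{
        struct foo_priv *priv = container_of(napi, struct foo_priv, napi);
        int work_done = foo_clean_rx_ring(priv, budget);

        /* only re-enable device interrupts once all work is done */
        if (work_done < budget && napi_complete_done(napi, work_done))
                foo_unmask_rx_irq(priv);

        return work_done;
}

/* in probe, before register_netdev():                              */
/*     netif_napi_add(dev, &priv->napi, foo_poll, NAPI_POLL_WEIGHT); */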
- void
netif_tx_napi_add(structnet_device * dev, struct napi_struct * napi, int (*poll)(struct napi_struct *, int), int weight)¶ initialize a NAPI context
Parameters
structnet_device*dev- network device
structnapi_struct*napi- NAPI context
int(*)(structnapi_struct*,int)poll- polling function
intweight- default weight
Description
This variant of netif_napi_add() should be used from drivers using NAPI to exclusively poll a TX queue. This avoids adding it to napi_hash[] and thus polluting that hash table.
- void
netif_napi_del(struct napi_struct * napi)¶ remove a NAPI context
Parameters
structnapi_struct*napi- NAPI context
netif_napi_del() removes a NAPI context from the network device NAPI list
- void
netif_start_queue(structnet_device * dev)¶ allow transmit
Parameters
structnet_device*dev- network device
Allow upper layers to call the device hard_start_xmit routine.
- void
netif_wake_queue(structnet_device * dev)¶ restart transmit
Parameters
structnet_device*dev- network device
Allow upper layers to call the device hard_start_xmit routine. Used for flow control when transmit resources are available.
- void
netif_stop_queue(structnet_device * dev)¶ stop transmitted packets
Parameters
structnet_device*dev- network device
Stop upper layers calling the device hard_start_xmit routine. Used for flow control when transmit resources are unavailable.
- bool
netif_queue_stopped(const structnet_device * dev)¶ test if transmit queue is flow blocked
Parameters
conststructnet_device*dev- network device
Test if transmit queue on device is currently unable to send.
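These helpers are typically paired for flow control: stop the queue from ndo_start_xmit() when the TX ring fills and wake it from the completion path (the foo_* ring helpers are hypothetical):

#include <linux/netdevice.h>

struct foo_priv {
        struct net_device *dev;
};

static void foo_post_descriptor(struct foo_priv *priv, struct sk_buff *skb); /* hypothetical */
static bool foo_tx_ring_full(struct foo_priv *priv);                         /* hypothetical */
static bool foo_tx_ring_has_room(struct foo_priv *priv);                     /* hypothetical */

static netdev_tx_t foo_start_xmit(struct sk_buff *skb, struct net_device *dev)
{
        struct foo_priv *priv = netdev_priv(dev);

        foo_post_descriptor(priv, skb);

        if (foo_tx_ring_full(priv))
                netif_stop_queue(dev);

        return NETDEV_TX_OK;
}

/* TX completion (interrupt or NAPI context) */
static void foo_tx_complete(struct foo_priv *priv)
{
        /* ... reclaim finished descriptors ... */
        if (netif_queue_stopped(priv->dev) && foo_tx_ring_has_room(priv))
                netif_wake_queue(priv->dev);
}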
- void
netdev_txq_bql_enqueue_prefetchw(struct netdev_queue * dev_queue)¶ prefetch bql data for write
Parameters
structnetdev_queue*dev_queue- pointer to transmit queue
Description
BQL enabled drivers might use this helper in their ndo_start_xmit(),to give appropriate hint to the CPU.
- void
netdev_txq_bql_complete_prefetchw(struct netdev_queue * dev_queue)¶ prefetch bql data for write
Parameters
structnetdev_queue*dev_queue- pointer to transmit queue
Description
BQL enabled drivers might use this helper in their TX completion path,to give appropriate hint to the CPU.
- void
netdev_sent_queue(structnet_device * dev, unsigned int bytes)¶ report the number of bytes queued to hardware
Parameters
structnet_device*dev- network device
unsignedintbytes- number of bytes queued to the hardware device queue
Report the number of bytes queued for sending/completion to the network device hardware queue. bytes should be a good approximation and should exactly match netdev_completed_queue() bytes.
- void
netdev_completed_queue(structnet_device * dev, unsigned int pkts, unsigned int bytes)¶ report bytes and packets completed by device
Parameters
structnet_device*dev- network device
unsignedintpkts- actual number of packets sent over the medium
unsignedintbytes- actual number of bytes sent over the medium
Report the number of bytes and packets transmitted by the network device hardware queue over the physical medium. bytes must exactly match the bytes amount passed to netdev_sent_queue().
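A hedged sketch of the BQL pairing: bytes reported via netdev_sent_queue() in the transmit path must be matched by netdev_completed_queue() from the completion path (the foo_* helpers are hypothetical):

#include <linux/netdevice.h>

struct foo_priv {
        struct net_device *dev;
};

static void foo_post_descriptor(struct foo_priv *priv, struct sk_buff *skb);  /* hypothetical */
static struct sk_buff *foo_reclaim_descriptor(struct foo_priv *priv);         /* hypothetical */

static netdev_tx_t foo_start_xmit(struct sk_buff *skb, struct net_device *dev)
{
        struct foo_priv *priv = netdev_priv(dev);
        unsigned int len = skb->len;

        foo_post_descriptor(priv, skb);
        netdev_sent_queue(dev, len);          /* bytes handed to the hardware */

        return NETDEV_TX_OK;
}

static void foo_tx_complete(struct foo_priv *priv)
{
        unsigned int pkts = 0, bytes = 0;
        struct sk_buff *skb;

        while ((skb = foo_reclaim_descriptor(priv)) != NULL) {
                pkts++;
                bytes += skb->len;
                dev_consume_skb_any(skb);
        }

        /* must exactly match the byte count given to netdev_sent_queue() */
        netdev_completed_queue(priv->dev, pkts, bytes);
}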
- void
netdev_reset_queue(structnet_device * dev_queue)¶ reset the packets and bytes count of a network device
Parameters
structnet_device*dev_queue- network device
Reset the bytes and packet count of a network device and clear the software flow control OFF bit for this network device
- u16
netdev_cap_txqueue(structnet_device * dev, u16 queue_index)¶ check if selected tx queue exceeds device queues
Parameters
structnet_device*dev- network device
u16queue_index- given tx queue index
Returns 0 if given tx queue index >= number of device tx queues, otherwise returns the originally passed tx queue index.
- bool
netif_running(const structnet_device * dev)¶ test if up
Parameters
conststructnet_device*dev- network device
Test if the device has been brought up.
- void
netif_start_subqueue(structnet_device * dev, u16 queue_index)¶ allow sending packets on subqueue
Parameters
structnet_device*dev- network device
u16queue_index- sub queue index
Description
Start individual transmit queue of a device with multiple transmit queues.
- void
netif_stop_subqueue(structnet_device * dev, u16 queue_index)¶ stop sending packets on subqueue
Parameters
structnet_device*dev- network device
u16queue_index- sub queue index
Description
Stop individual transmit queue of a device with multiple transmit queues.
- bool
__netif_subqueue_stopped(const structnet_device * dev, u16 queue_index)¶ test status of subqueue
Parameters
conststructnet_device*dev- network device
u16queue_index- sub queue index
Description
Check individual transmit queue of a device with multiple transmit queues.
- void
netif_wake_subqueue(structnet_device * dev, u16 queue_index)¶ allow sending packets on subqueue
Parameters
structnet_device*dev- network device
u16queue_index- sub queue index
Description
Resume individual transmit queue of a device with multiple transmit queues.
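The same flow-control pattern extends to multiqueue devices, with the queue index typically taken from skb_get_queue_mapping(). The sketch below is hypothetical; foo_tx_ring_full() again stands in for the driver's own ring-space check.

/* Multiqueue variant of the TX flow-control pattern. */
static netdev_tx_t foo_mq_xmit(struct sk_buff *skb, struct net_device *dev)
{
        u16 q = skb_get_queue_mapping(skb);

        /* ... post skb to TX ring 'q' ... */

        if (foo_tx_ring_full(dev, q))          /* hypothetical helper */
                netif_stop_subqueue(dev, q);

        return NETDEV_TX_OK;
}

static void foo_mq_tx_complete(struct net_device *dev, u16 q)
{
        /* ... reclaim descriptors of ring 'q' ... */

        if (__netif_subqueue_stopped(dev, q))
                netif_wake_subqueue(dev, q);
}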
- bool
netif_attr_test_mask(unsigned long j, const unsigned long * mask, unsigned int nr_bits)¶ Test a CPU or Rx queue set in a mask
Parameters
unsignedlongj- CPU/Rx queue index
constunsignedlong*mask- bitmask of all cpus/rx queues
unsignedintnr_bits- number of bits in the bitmask
Description
Test if a CPU or Rx queue index is set in a mask of all CPU/Rx queues.
- bool
netif_attr_test_online(unsigned long j, const unsigned long * online_mask, unsigned int nr_bits)¶ Test for online CPU/Rx queue
Parameters
unsignedlongj- CPU/Rx queue index
constunsignedlong*online_mask- bitmask for CPUs/Rx queues that are online
unsignedintnr_bits- number of bits in the bitmask
Description
Returns true if a CPU/Rx queue is online.
- unsigned int
netif_attrmask_next(int n, const unsigned long * srcp, unsigned int nr_bits)¶ get the next CPU/Rx queue in a cpu/Rx queues mask
Parameters
intn- CPU/Rx queue index
constunsignedlong*srcp- the cpumask/Rx queue mask pointer
unsignedintnr_bits- number of bits in the bitmask
Description
Returns >= nr_bits if no further CPUs/Rx queues set.
- int
netif_attrmask_next_and(int n, const unsigned long * src1p, const unsigned long * src2p, unsigned int nr_bits)¶ get the next CPU/Rx queue in *src1p & *src2p
Parameters
intn- CPU/Rx queue index
constunsignedlong*src1p- the first CPUs/Rx queues mask pointer
constunsignedlong*src2p- the second CPUs/Rx queues mask pointer
unsignedintnr_bits- number of bits in the bitmask
Description
Returns >= nr_bits if no further CPUs/Rx queues set in both.
- bool
netif_is_multiqueue(const structnet_device * dev)¶ test if device has multiple transmit queues
Parameters
conststructnet_device*dev- network device
Description
Check if device has multiple transmit queues
- void
dev_put(structnet_device * dev)¶ release reference to device
Parameters
structnet_device*dev- network device
Description
Release reference to device to allow it to be freed.
- void
dev_hold(structnet_device * dev)¶ get reference to device
Parameters
structnet_device*dev- network device
Description
Hold reference to device to keep it from being freed.
- bool
netif_carrier_ok(const structnet_device * dev)¶ test if carrier present
Parameters
conststructnet_device*dev- network device
Description
Check if carrier is present on device
- void
netif_dormant_on(structnet_device * dev)¶ mark device as dormant.
Parameters
structnet_device*dev- network device
Description
Mark device as dormant (as per RFC2863).
The dormant state indicates that the relevant interface is not actually in a condition to pass packets (i.e., it is not ‘up’) but is in a “pending” state, waiting for some external event. For “on-demand” interfaces, this new state identifies the situation where the interface is waiting for events to place it in the up state.
- void
netif_dormant_off(structnet_device * dev)¶ set device as not dormant.
Parameters
structnet_device*dev- network device
Description
Device is not in dormant state.
- bool
netif_dormant(const structnet_device * dev)¶ test if device is dormant
Parameters
conststructnet_device*dev- network device
Description
Check if device is dormant.
- void
netif_testing_on(structnet_device * dev)¶ mark device as under test.
Parameters
structnet_device*dev- network device
Description
Mark device as under test (as per RFC2863).
The testing state indicates that some test(s) must be performed on the interface. After completion of the test, the interface state will change to up, dormant, or down, as appropriate.
- void
netif_testing_off(structnet_device * dev)¶ set device as not under test.
Parameters
structnet_device*dev- network device
Description
Device is not in testing state.
- bool
netif_testing(const structnet_device * dev)¶ test if device is under test
Parameters
conststructnet_device*dev- network device
Description
Check if device is under test
- bool
netif_oper_up(const structnet_device * dev)¶ test if device is operational
Parameters
conststructnet_device*dev- network device
Description
Check if carrier is operational
- bool
netif_device_present(structnet_device * dev)¶ is device available or removed
Parameters
structnet_device*dev- network device
Description
Check if device has not been removed from system.
- void
netif_tx_lock(structnet_device * dev)¶ grab network device transmit lock
Parameters
structnet_device*dev- network device
Description
Get network device transmit lock
- int
__dev_uc_sync(structnet_device * dev, int (*sync)(structnet_device *, const unsigned char *), int (*unsync) (structnet_device *, const unsigned char *))¶ Synchonize device’s unicast list
Parameters
structnet_device*dev- device to sync
int(*)(structnet_device*,constunsignedchar*)sync- function to call if address should be added
int(*)(structnet_device*,constunsignedchar*)unsync- function to call if address should be removed
Add newly added addresses to the interface, and release addresses that have been deleted.
- void
__dev_uc_unsync(structnet_device * dev, int (*unsync)(structnet_device *, const unsigned char *))¶ Remove synchronized addresses from device
Parameters
structnet_device*dev- device to sync
int(*)(structnet_device*,constunsignedchar*)unsync- function to call if address should be removed
Remove all addresses that were added to the device by dev_uc_sync().
- int
__dev_mc_sync(structnet_device * dev, int (*sync)(structnet_device *, const unsigned char *), int (*unsync) (structnet_device *, const unsigned char *))¶ Synchonize device’s multicast list
Parameters
structnet_device*dev- device to sync
int(*)(structnet_device*,constunsignedchar*)sync- function to call if address should be added
int(*)(structnet_device*,constunsignedchar*)unsync- function to call if address should be removed
Add newly added addresses to the interface, and release addresses that have been deleted.
- void
__dev_mc_unsync(structnet_device * dev, int (*unsync)(structnet_device *, const unsigned char *))¶ Remove synchronized addresses from device
Parameters
structnet_device*dev- device to sync
int(*)(structnet_device*,constunsignedchar*)unsyncfunction to call if address should be removed
Remove all addresses that were added to the device by dev_mc_sync().
PHY Support¶
- void
phy_print_status(struct phy_device * phydev)¶ Convenience function to print out the current phy status
Parameters
structphy_device*phydev- the phy_device struct
- int
phy_restart_aneg(struct phy_device * phydev)¶ restart auto-negotiation
Parameters
structphy_device*phydev- target phy_device struct
Description
Restart the autonegotiation on phydev. Returns >= 0 on success or negative errno on error.
- int
phy_aneg_done(struct phy_device * phydev)¶ return auto-negotiation status
Parameters
structphy_device*phydev- target phy_device struct
Description
Return the auto-negotiation status from this phydev. Returns > 0 on success or < 0 on error. 0 means that auto-negotiation is still pending.
- int
phy_mii_ioctl(struct phy_device * phydev, struct ifreq * ifr, int cmd)¶ generic PHY MII ioctl interface
Parameters
structphy_device*phydev- the phy_device struct
structifreq*ifr- struct ifreq for socket ioctl's
intcmd- ioctl cmd to execute
Description
Note that this function is currently incompatible with the PHYCONTROL layer. It changes registers without regard to current state. Use at own risk.
- int
phy_do_ioctl(structnet_device * dev, struct ifreq * ifr, int cmd)¶ generic ndo_do_ioctl implementation
Parameters
structnet_device*dev- the net_device struct
structifreq*ifr- struct ifreq for socket ioctl's
intcmd- ioctl cmd to execute
- int
phy_start_aneg(struct phy_device * phydev)¶ start auto-negotiation for this PHY device
Parameters
structphy_device*phydev- the phy_device struct
Description
- Sanitizes the settings (if we’re not autonegotiating
- them), and then calls the driver’s config_aneg function. If the PHYCONTROL Layer is operating, we change the state to reflect the beginning of Auto-negotiation or forcing.
- int
phy_speed_down(struct phy_device * phydev, bool sync)¶ set speed to lowest speed supported by both link partners
Parameters
structphy_device*phydev- the phy_device struct
boolsync- perform action synchronously
Description
Typically used to save energy when waiting for a WoL packet
WARNING: Setting sync to false may cause the system to be unable to suspend if the PHY generates an interrupt when finishing the autonegotiation. This interrupt may wake up the system immediately after suspend. Therefore use sync = false only if you’re sure it’s safe with the respective network chip.
- int
phy_speed_up(struct phy_device * phydev)¶ (re)set advertised speeds to all supported speeds
Parameters
structphy_device*phydev- the phy_device struct
Description
Used to revert the effect of phy_speed_down
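A typical pairing (sketched here against hypothetical foo_suspend/foo_resume PM callbacks) drops the link speed while waiting for a Wake-on-LAN packet and restores it on resume; ndev->phydev is assumed to be the PHY attached to the driver's net_device.

/* Sketch: lowest-common-speed while armed for WoL, full speed on resume. */
static int foo_suspend(struct device *dev)
{
        struct net_device *ndev = dev_get_drvdata(dev);

        /* sync = true: wait for the renegotiation to finish before suspending */
        phy_speed_down(ndev->phydev, true);
        return 0;
}

static int foo_resume(struct device *dev)
{
        struct net_device *ndev = dev_get_drvdata(dev);

        /* revert the effect of phy_speed_down() */
        phy_speed_up(ndev->phydev);
        return 0;
}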
- void
phy_start_machine(struct phy_device * phydev)¶ start PHY state machine tracking
Parameters
structphy_device*phydev- the phy_device struct
Description
- The PHY infrastructure can run a state machine
- which tracks whether the PHY is starting up, negotiating, etc. This function starts the delayed workqueue which tracks the state of the PHY. If you want to maintain your own state machine, do not call this function.
- void
phy_request_interrupt(struct phy_device * phydev)¶ request and enable interrupt for a PHY device
Parameters
structphy_device*phydev- target phy_device struct
Description
- Request and enable the interrupt for the given PHY.
- If this fails, then we set irq to PHY_POLL. This should only be called with a valid IRQ number.
- void
phy_free_interrupt(struct phy_device * phydev)¶ disable and free interrupt for a PHY device
Parameters
structphy_device*phydev- target phy_device struct
Description
- Disable and free the interrupt for the given PHY.
- This should only be called with a valid IRQ number.
- void
phy_stop(struct phy_device * phydev)¶ Bring down the PHY link, and stop checking the status
Parameters
structphy_device*phydev- target phy_device struct
- void
phy_start(struct phy_device * phydev)¶ start or restart a PHY device
Parameters
structphy_device*phydev- target phy_device struct
Description
- Indicates the attached device’s readiness to
- handle PHY-related work. Used during startup to start the PHY, and after a call to phy_stop() to resume operation. Also used to indicate the MDIO bus has cleared an error condition.
- void
phy_mac_interrupt(struct phy_device * phydev)¶ MAC says the link has changed
Parameters
structphy_device*phydev- phy_device struct with changed link
Description
The MAC layer is able to indicate there has been a change in the PHY link status. Trigger the state machine and run its work queue.
- int
phy_init_eee(struct phy_device * phydev, bool clk_stop_enable)¶ init and check the EEE feature
Parameters
structphy_device*phydev- target phy_device struct
boolclk_stop_enable- PHY may stop the clock during LPI
Description
It checks if the Energy-Efficient Ethernet (EEE) is supported by looking at the MMD registers 3.20 and 7.60/61, and it programs the MMD register 3.0 setting the “Clock stop enable” bit if required.
- int
phy_get_eee_err(struct phy_device * phydev)¶ report the EEE wake error count
Parameters
structphy_device*phydev- target phy_device struct
Description
It reports the number of times the PHY failed to complete its normal wake sequence.
- int
phy_ethtool_get_eee(struct phy_device * phydev, struct ethtool_eee * data)¶ get EEE supported and status
Parameters
structphy_device*phydev- target phy_device struct
structethtool_eee*data- ethtool_eee data
Description
It reports the Supported/Advertisement/LP Advertisement capabilities.
- int
phy_ethtool_set_eee(struct phy_device * phydev, struct ethtool_eee * data)¶ set EEE supported and status
Parameters
structphy_device*phydev- target phy_device struct
structethtool_eee*data- ethtool_eee data
Description
It programs the Advertisement EEE register.
- int
phy_clear_interrupt(struct phy_device * phydev)¶ Ack the phy device’s interrupt
Parameters
structphy_device*phydev- the phy_device struct
Description
If the phydev driver has an ack_interrupt function, call it to ack and clear the phy device’s interrupt.
Returns 0 on success or < 0 on error.
- int
phy_config_interrupt(struct phy_device * phydev, bool interrupts)¶ configure the PHY device for the requested interrupts
Parameters
structphy_device*phydev- the phy_device struct
boolinterrupts- interrupt flags to configure for this phydev
Description
Returns 0 on success or < 0 on error.
- const struct phy_setting *
phy_find_valid(int speed, int duplex, unsigned long * supported)¶ find a PHY setting that matches the requested parameters
Parameters
intspeed- desired speed
intduplex- desired duplex
unsignedlong*supported- mask of supported link modes
Description
Locate a supported phy setting that is, in priority order:
- an exact match for the specified speed and duplex mode
- a match for the specified speed, or slower speed
- the slowest supported speed
Returns the matched phy_setting entry, or NULL if no supported phy settings were found.
- unsigned int
phy_supported_speeds(struct phy_device * phy, unsigned int * speeds, unsigned int size)¶ return all speeds currently supported by a phy device
Parameters
structphy_device*phy- The phy device to return supported speeds of.
unsignedint*speeds- buffer to store supported speeds in.
unsignedintsize- size of speeds buffer.
Description
Returns the number of supported speeds, and fills the speeds buffer with the supported speeds. If the speeds buffer is too small to contain all currently supported speeds, will return as many speeds as can fit.
- bool
phy_check_valid(int speed, int duplex, unsigned long * features)¶ check if there is a valid PHY setting which matches speed, duplex, and feature mask
Parameters
intspeed- speed to match
intduplex- duplex to match
unsignedlong*features- A mask of the valid settings
Description
Returns true if there is a valid setting, false otherwise.
- void
phy_sanitize_settings(struct phy_device * phydev)¶ make sure the PHY is set to supported speed and duplex
Parameters
structphy_device*phydev- the target phy_device struct
Description
- Make sure the PHY is set to supported speeds and
- duplexes. Drop down by one in this order: 1000/FULL, 1000/HALF, 100/FULL, 100/HALF, 10/FULL, 10/HALF.
- int
phy_check_link_status(struct phy_device * phydev)¶ check link status and set state accordingly
Parameters
structphy_device*phydev- the phy_device struct
Description
Check for link and whether autoneg was triggered / is running and set state accordingly
- void
phy_stop_machine(struct phy_device * phydev)¶ stop the PHY state machine tracking
Parameters
structphy_device*phydev- target phy_device struct
Description
- Stops the state machine delayed workqueue, sets the
- state to UP (unless it wasn’t up yet). This function must be called BEFORE phy_detach.
- void
phy_error(struct phy_device * phydev)¶ enter HALTED state for this PHY device
Parameters
structphy_device*phydev- target phy_device struct
Description
Moves the PHY to the HALTED state in response to a read or write error, and tells the controller the link is down. Must not be called from interrupt context, or while the phydev->lock is held.
- int
phy_disable_interrupts(struct phy_device * phydev)¶ Disable the PHY interrupts from the PHY side
Parameters
structphy_device*phydev- target phy_device struct
- irqreturn_t
phy_interrupt(int irq, void * phy_dat)¶ PHY interrupt handler
Parameters
intirq- interrupt line
void*phy_dat- phy_device pointer
Description
Handle PHY interrupt
- int
phy_enable_interrupts(struct phy_device * phydev)¶ Enable the interrupts from the PHY side
Parameters
structphy_device*phydev- target phy_device struct
- void
phy_state_machine(struct work_struct * work)¶ Handle the state machine
Parameters
structwork_struct*work- work_struct that describes the work to be done
- int
phy_register_fixup(const char * bus_id, u32 phy_uid, u32 phy_uid_mask, int (*run)(struct phy_device *))¶ creates a new phy_fixup and adds it to the list
Parameters
constchar*bus_id- A string which matches phydev->mdio.dev.bus_id (or PHY_ANY_ID)
u32phy_uid- Used to match against phydev->phy_id (the UID of the PHY). It can also be PHY_ANY_UID
u32phy_uid_mask- Applied to phydev->phy_id and fixup->phy_uid before comparison
int(*)(structphy_device*)run- The actual code to be run when a matching PHY is found
- int
phy_unregister_fixup(const char * bus_id, u32 phy_uid, u32 phy_uid_mask)¶ remove a phy_fixup from the list
Parameters
constchar*bus_id- A string matches fixup->bus_id (or PHY_ANY_ID) in phy_fixup_list
u32phy_uid- A phy id matches fixup->phy_id (or PHY_ANY_UID) in phy_fixup_list
u32phy_uid_mask- Applied to phy_uid and fixup->phy_uid before comparison
- struct phy_device *
get_phy_device(struct mii_bus * bus, int addr, bool is_c45)¶ reads the specified PHY device and returns its phy_device struct
Parameters
structmii_bus*bus- the target MII bus
intaddr- PHY address on the MII bus
boolis_c45- If true the PHY uses the 802.3 clause 45 protocol
Description
Probe for a PHY at addr on bus.
When probing for a clause 22 PHY, read the ID registers. If we find a valid ID, allocate and return a struct phy_device.
When probing for a clause 45 PHY, read the “devices in package” registers. If the “devices in package” appears valid, read the ID registers for each MMD, allocate and return a struct phy_device.
Returns an allocated struct phy_device on success, -ENODEV if there is no PHY present, or -EIO on bus access error.
- int
phy_device_register(struct phy_device * phydev)¶ Register the phy device on the MDIO bus
Parameters
structphy_device*phydev- phy_device structure to be added to the MDIO bus
- void
phy_device_remove(struct phy_device * phydev)¶ Remove a previously registered phy device from the MDIO bus
Parameters
structphy_device*phydev- phy_device structure to remove
Description
This doesn’t free the phy_device itself, it merely reverses the effects of phy_device_register(). Use phy_device_free() to free the device after calling this function.
- struct phy_device *
phy_find_first(struct mii_bus * bus)¶ finds the first PHY device on the bus
Parameters
structmii_bus*bus- the target MII bus
- int
phy_connect_direct(structnet_device * dev, struct phy_device * phydev, void (*handler)(structnet_device *), phy_interface_t interface)¶ connect an ethernet device to a specific phy_device
Parameters
structnet_device*dev- the network device to connect
structphy_device*phydev- the pointer to the phy device
void(*)(structnet_device*)handler- callback function for state change notifications
phy_interface_tinterface- PHY device’s interface
- struct phy_device *
phy_connect(structnet_device * dev, const char * bus_id, void (*handler)(structnet_device *), phy_interface_t interface)¶ connect an ethernet device to a PHY device
Parameters
structnet_device*dev- the network device to connect
constchar*bus_id- the id string of the PHY device to connect
void(*)(structnet_device*)handler- callback function for state change notifications
phy_interface_tinterface- PHY device’s interface
Description
- Convenience function for connecting ethernet
- devices to PHY devices. The default behavior is for the PHY infrastructure to handle everything, and only notify the connected driver when the link status changes. If you don’t want, or can’t use the provided functionality, you may choose to call only the subset of functions which provide the desired functionality.
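The usual pairing in a MAC driver is phy_connect()/phy_start() in ndo_open and phy_stop()/phy_disconnect() in ndo_stop. The sketch below is hypothetical: foo_adjust_link() is a placeholder link-change callback, and the bus_id string is only an example of the "mdio-bus-name:addr" form, not a real device.

/* Sketch: connecting and starting a PHY in ndo_open, tearing it down
 * in ndo_stop. "foo-mdio:01" and foo_adjust_link() are placeholders.
 */
static int foo_open(struct net_device *dev)
{
        struct phy_device *phydev;

        phydev = phy_connect(dev, "foo-mdio:01", foo_adjust_link,
                             PHY_INTERFACE_MODE_RGMII);
        if (IS_ERR(phydev))
                return PTR_ERR(phydev);

        phy_start(phydev);      /* start the PHY state machine */
        return 0;
}

static int foo_stop(struct net_device *dev)
{
        phy_stop(dev->phydev);
        phy_disconnect(dev->phydev);
        return 0;
}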
- void
phy_disconnect(struct phy_device * phydev)¶ disable interrupts, stop state machine, and detach a PHY device
Parameters
structphy_device*phydev- target phy_device struct
- void
phy_sfp_attach(void * upstream, structsfp_bus * bus)¶ attach the SFP bus to the PHY upstream network device
Parameters
void*upstream- pointer to the phy device
structsfp_bus*bus- sfp bus representing cage being attached
Description
This is used to fill in the sfp_upstream_ops .attach member.
- void
phy_sfp_detach(void * upstream, structsfp_bus * bus)¶ detach the SFP bus from the PHY upstream network device
Parameters
void*upstream- pointer to the phy device
structsfp_bus*bus- sfp bus representing cage being attached
Description
This is used to fill in the sfp_upstream_ops .detach member.
- int
phy_sfp_probe(struct phy_device * phydev, const structsfp_upstream_ops * ops)¶ probe for a SFP cage attached to this PHY device
Parameters
structphy_device*phydev- Pointer to phy_device
conststructsfp_upstream_ops*ops- SFP’s upstream operations
- int
phy_attach_direct(structnet_device * dev, struct phy_device * phydev, u32 flags, phy_interface_t interface)¶ attach a network device to a given PHY device pointer
Parameters
structnet_device*dev- network device to attach
structphy_device*phydev- Pointer to phy_device to attach
u32flags- PHY device’s dev_flags
phy_interface_tinterface- PHY device’s interface
Description
- Called by drivers to attach to a particular PHY
- device. The phy_device is found, and properly hooked up to the phy_driver. If no driver is attached, then a generic driver is used. The phy_device is given a ptr to the attaching device, and given a callback for link status change. The phy_device is returned to the attaching driver. This function takes a reference on the phy device.
- struct phy_device *
phy_attach(structnet_device * dev, const char * bus_id, phy_interface_t interface)¶ attach a network device to a particular PHY device
Parameters
structnet_device*dev- network device to attach
constchar*bus_id- Bus ID of PHY device to attach
phy_interface_tinterface- PHY device’s interface
Description
- Same as phy_attach_direct() except that a PHY bus_id
- string is passed instead of a pointer to a struct phy_device.
- int
phy_package_join(struct phy_device * phydev, int addr, size_t priv_size)¶ join a common PHY group
Parameters
structphy_device*phydev- target phy_device struct
intaddr- cookie and PHY address for global register access
size_tpriv_size- if non-zero allocate this amount of bytes for private data
Description
This joins a PHY group and provides a shared storage for all phydevs in this group. This is intended to be used for packages which contain more than one PHY, for example a quad PHY transceiver.
The addr parameter serves as a cookie which has to have the same value for all members of one group and as a PHY address to access generic registers of a PHY package. Usually, one of the PHY addresses of the different PHYs in the package provides access to these global registers. The address which is given here will be used in the phy_package_read() and phy_package_write() convenience functions. If your PHY doesn’t have global registers you can just pick any of the PHY addresses.
This will set the shared pointer of the phydev to the shared storage. If this is the first call for this cookie the shared storage will be allocated. If priv_size is non-zero, the given amount of bytes is allocated for the priv member.
Returns < 0 on error, 0 on success. In particular, calling phy_package_join() with the same cookie but a different priv_size is an error.
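A rough sketch of how a PHY driver for a multi-PHY package might use this: foo_shared, foo_phy_probe() and foo_phy_remove() are hypothetical, and the way base_addr is derived from the PHY address is only an example of picking a common cookie for all PHYs in the package.

/* Sketch: joining a package of PHYs that share global registers. */
struct foo_shared {
        u32 quirks;     /* example of per-package shared state */
};

static int foo_phy_probe(struct phy_device *phydev)
{
        /* hypothetical: first PHY of a quad carries the global registers */
        int base_addr = phydev->mdio.addr & ~3;
        int ret;

        ret = phy_package_join(phydev, base_addr, sizeof(struct foo_shared));
        if (ret < 0)
                return ret;

        /* shared storage is now reachable via phydev->shared->priv */
        return 0;
}

static void foo_phy_remove(struct phy_device *phydev)
{
        phy_package_leave(phydev);
}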
- void
phy_package_leave(struct phy_device * phydev)¶ leave a common PHY group
Parameters
structphy_device*phydev- target phy_device struct
Description
This leaves a PHY group created by phy_package_join(). If this phydev was the last user of the shared data in the group, this data is freed. Resets the phydev->shared pointer to NULL.
- int
devm_phy_package_join(structdevice * dev, struct phy_device * phydev, int addr, size_t priv_size)¶ resource managed
phy_package_join()
Parameters
structdevice*dev- device that is registering this PHY package
structphy_device*phydev- target phy_device struct
intaddr- cookie and PHY address for global register access
size_tpriv_size- if non-zero allocate this amount of bytes for private data
Description
Managed phy_package_join(). For shared storage fetched by this function, phy_package_leave() is automatically called on driver detach. See phy_package_join() for more information.
- void
phy_detach(struct phy_device * phydev)¶ detach a PHY device from its network device
Parameters
structphy_device*phydev- target phy_device struct
Description
This detaches the phy device from its network device and the phy driver, and drops the reference count taken in phy_attach_direct().
- int
phy_reset_after_clk_enable(struct phy_device * phydev)¶ perform a PHY reset if needed
Parameters
structphy_device*phydev- target phy_device struct
Description
- Some PHYs are known to need a reset after their refclk was
- enabled. This function evaluates the flags and performs the reset if it’s needed. Returns < 0 on error, 0 if the phy wasn’t reset and 1 if the phy was reset.
- int
genphy_config_eee_advert(struct phy_device * phydev)¶ disable unwanted eee mode advertisement
Parameters
structphy_device*phydev- target phy_device struct
Description
- Writes MDIO_AN_EEE_ADV after disabling unsupported energy
- efficient ethernet modes. Returns 0 if the PHY’s advertisement hasn’t changed, and 1 if it has changed.
- int
genphy_setup_forced(struct phy_device * phydev)¶ configures/forces speed/duplex from phydev
Parameters
structphy_device*phydev- target phy_device struct
Description
- Configures MII_BMCR to force speed/duplex
- to the values in phydev. Assumes that the values are valid. Please see phy_sanitize_settings().
- int
genphy_restart_aneg(struct phy_device * phydev)¶ Enable and Restart Autonegotiation
Parameters
structphy_device*phydev- target phy_device struct
- int
genphy_check_and_restart_aneg(struct phy_device * phydev, bool restart)¶ Enable and restart auto-negotiation
Parameters
structphy_device*phydev- target phy_device struct
boolrestart- whether aneg restart is requested
Description
Check, and restart auto-negotiation if needed.
- int
__genphy_config_aneg(struct phy_device * phydev, bool changed)¶ restart auto-negotiation or write BMCR
Parameters
structphy_device*phydev- target phy_device struct
boolchanged- whether autoneg is requested
Description
- If auto-negotiation is enabled, we configure the
- advertising, and then restart auto-negotiation. If it is not enabled, then we write the BMCR.
- int
genphy_c37_config_aneg(struct phy_device * phydev)¶ restart auto-negotiation or write BMCR
Parameters
structphy_device*phydev- target phy_device struct
Description
- If auto-negotiation is enabled, we configure the
- advertising, and then restart auto-negotiation. If it is not enabled, then we write the BMCR. This function is intended for use with Clause 37 1000Base-X mode.
- int
genphy_aneg_done(struct phy_device * phydev)¶ return auto-negotiation status
Parameters
structphy_device*phydev- target phy_device struct
Description
- Reads the status register and returns 0 either if
- auto-negotiation is incomplete, or if there was an error. Returns BMSR_ANEGCOMPLETE if auto-negotiation is done.
- int
genphy_update_link(struct phy_device * phydev)¶ update link status in phydev
Parameters
structphy_device*phydev- target phy_device struct
Description
- Update the value in phydev->link to reflect the
- current link value. In order to do this, we need to read the status register twice, keeping the second value.
- int
genphy_read_status_fixed(struct phy_device * phydev)¶ read the link parameters for !aneg mode
Parameters
structphy_device*phydev- target phy_device struct
Description
Read the current duplex and speed state for a PHY operating withautonegotiation disabled.
- int
genphy_read_status(struct phy_device * phydev)¶ check the link status and update current link state
Parameters
structphy_device*phydev- target phy_device struct
Description
- Check the link, then figure out the current state
- by comparing what we advertise with what the link partner advertises. Start by checking the gigabit possibilities, then move on to 10/100.
- int
genphy_c37_read_status(struct phy_device * phydev)¶ check the link status and update current link state
Parameters
structphy_device*phydev- target phy_device struct
Description
- Check the link, then figure out the current state
- by comparing what we advertise with what the link partner advertises. This function is for Clause 37 1000Base-X mode.
- int
genphy_soft_reset(struct phy_device * phydev)¶ software reset the PHY via BMCR_RESET bit
Parameters
structphy_device*phydev- target phy_device struct
Description
Perform a software PHY reset using the standard BMCR_RESET bit and poll for the reset bit to be cleared.
Return
0 on success, < 0 on failure
- int
genphy_read_abilities(struct phy_device * phydev)¶ read PHY abilities from Clause 22 registers
Parameters
structphy_device*phydev- target phy_device struct
Description
Reads the PHY’s abilities and populates phydev->supported accordingly.
Return
0 on success, < 0 on failure
- void
phy_remove_link_mode(struct phy_device * phydev, u32 link_mode)¶ Remove a supported link mode
Parameters
structphy_device*phydev- phy_device structure to remove link mode from
u32link_mode- Link mode to be removed
Description
Some MACs don’t support all link modes which the PHY does, e.g. a 1G MAC often does not support 1000Half. This helper removes such a link mode.
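For instance, a MAC that cannot do half duplex at gigabit speed might drop the corresponding mode after attaching the PHY; the wrapper function name below is hypothetical.

/* Sketch: drop 1000Half from the supported/advertised modes of an
 * attached PHY whose MAC cannot handle it.
 */
static void foo_fixup_phy_modes(struct phy_device *phydev)
{
        phy_remove_link_mode(phydev, ETHTOOL_LINK_MODE_1000baseT_Half_BIT);
}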
- void
phy_advertise_supported(struct phy_device * phydev)¶ Advertise all supported modes
Parameters
structphy_device*phydev- target phy_device struct
Description
Called to advertise all supported modes, doesn’t touch pause mode advertising.
- void
phy_support_sym_pause(struct phy_device * phydev)¶ Enable support of symmetrical pause
Parameters
structphy_device*phydev- target phy_device struct
Description
Called by the MAC to indicate it supports symmetrical Pause, but not asym pause.
- void
phy_support_asym_pause(struct phy_device * phydev)¶ Enable support of asym pause
Parameters
structphy_device*phydev- target phy_device struct
Description
Called by the MAC to indicate it supports Asym Pause.
- void
phy_set_sym_pause(struct phy_device * phydev, bool rx, bool tx, bool autoneg)¶ Configure symmetric Pause
Parameters
structphy_device*phydev- target phy_device struct
boolrx- Receiver Pause is supported
booltx- Transmit Pause is supported
boolautoneg- Auto neg should be used
Description
Configure advertised Pause support depending on whether receiver pause and pause auto neg are supported. Generally called from the set_pauseparam .ndo.
- void
phy_set_asym_pause(struct phy_device * phydev, bool rx, bool tx)¶ Configure Pause and Asym Pause
Parameters
structphy_device*phydev- target phy_device struct
boolrx- Receiver Pause is supported
booltx- Transmit Pause is supported
Description
Configure advertised Pause support depending on whether transmit and receiver pause are supported. If there has been a change in advertising, trigger a new autoneg. Generally called from the set_pauseparam .ndo.
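As a sketch of the "set_pauseparam .ndo" mentioned above, an ethtool .set_pauseparam handler can validate the request with phy_validate_pause() and then forward it to the PHY layer; foo_set_pauseparam() is a hypothetical handler name.

/* Sketch of an ethtool .set_pauseparam implementation. */
static int foo_set_pauseparam(struct net_device *dev,
                              struct ethtool_pauseparam *pause)
{
        struct phy_device *phydev = dev->phydev;

        /* reject pause configurations the PHY/MAC combination cannot do */
        if (!phy_validate_pause(phydev, pause))
                return -EINVAL;

        /* update advertisement; triggers a new autoneg if it changed */
        phy_set_asym_pause(phydev, pause->rx_pause, pause->tx_pause);
        return 0;
}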
- bool
phy_validate_pause(struct phy_device * phydev, struct ethtool_pauseparam * pp)¶ Test if the PHY/MAC support the pause configuration
Parameters
structphy_device*phydev- phy_device struct
structethtool_pauseparam*pp- requested pause configuration
Description
Test if the PHY/MAC combination supports the Pause configuration the user is requesting. Returns true if it is supported, false otherwise.
- void
phy_get_pause(struct phy_device * phydev, bool * tx_pause, bool * rx_pause)¶ resolve negotiated pause modes
Parameters
structphy_device*phydev- phy_device struct
bool*tx_pause- pointer to bool to indicate whether transmit pause should be enabled.
bool*rx_pause- pointer to bool to indicate whether receive pause should be enabled.
Description
Resolve and return the flow control modes according to the negotiation result. This includes checking that we are operating in full duplex mode. See linkmode_resolve_pause() for further details.
- s32
phy_get_internal_delay(struct phy_device * phydev, structdevice * dev, const int * delay_values, int size, bool is_rx)¶ returns the index of the internal delay
Parameters
structphy_device*phydev- phy_device struct
structdevice*dev- pointer to the devices device struct
constint*delay_values- array of delays the PHY supports
intsize- the size of the delay array
boolis_rx- boolean to indicate to get the rx internal delay
Description
Returns the index within the array of the internal delay passed in. If the device property is not present, then the interface type is checked: if the interface defines use of an internal delay, 1 is returned, otherwise 0 is returned. The array must be in ascending order. If the PHY does not have an ascending-order array, then size = 0 and the value of the delay property is returned. Returns -EINVAL if the delay is invalid or cannot be found.
- int
phy_driver_register(struct phy_driver * new_driver, struct module * owner)¶ register a phy_driver with the PHY layer
Parameters
structphy_driver*new_driver- new phy_driver to register
structmodule*owner- module owning this PHY
- int
get_phy_c45_ids(struct mii_bus * bus, int addr, struct phy_c45_device_ids * c45_ids)¶ reads the specified addr for its 802.3-c45 IDs.
Parameters
structmii_bus*bus- the target MII bus
intaddr- PHY address on the MII bus
structphy_c45_device_ids*c45_ids- where to store the c45 ID information.
Description
Read the PHY “devices in package”. If this appears to be valid, read the PHY identifiers for each device. Return the “devices in package” and identifiers in c45_ids.
Returns zero on success, -EIO on bus access error, or -ENODEV if the “devices in package” is invalid.
- int
get_phy_c22_id(struct mii_bus * bus, int addr, u32 * phy_id)¶ reads the specified addr for its clause 22 ID.
Parameters
structmii_bus*bus- the target MII bus
intaddr- PHY address on the MII bus
u32*phy_id- where to store the ID retrieved.
Description
Read the 802.3 clause 22 PHY ID from the PHY at addr on the bus, placing it in phy_id. Return zero on a successful read with a valid ID, -EIO on bus access error, or -ENODEV if no device responds or the ID is invalid.
- void
phy_prepare_link(struct phy_device * phydev, void (*handler)(structnet_device *))¶ prepares the PHY layer to monitor link status
Parameters
structphy_device*phydev- target phy_device struct
void(*)(structnet_device*)handler- callback function for link status change notifications
Description
- Tells the PHY infrastructure to handle the
- gory details on monitoring link status (whether through polling or an interrupt), and to call back to the connected device driver when the link status changes. If you want to monitor your own link state, don’t call this function.
- int
phy_poll_reset(struct phy_device * phydev)¶ Safely wait until a PHY reset has properly completed
Parameters
structphy_device*phydev- The PHY device to poll
Description
- According to IEEE 802.3, Section 2, Subsection 22.2.4.1.1, as
published in 2008, a PHY reset may take up to 0.5 seconds. The MII BMCR register must be polled until the BMCR_RESET bit clears.
Furthermore, any attempts to write to PHY registers may have no effect or even generate MDIO bus errors until this is complete.
Some PHYs (such as the Marvell 88E1111) don’t entirely conform to the standard and do not fully reset after the BMCR_RESET bit is set, and may even REQUIRE a soft-reset to properly restart autonegotiation. In an effort to support such broken PHYs, this function is separate from the standard phy_init_hw() which will zero all the other bits in the BMCR and reapply all driver-specific and board-specific fixups.
- int
genphy_config_advert(struct phy_device * phydev)¶ sanitize and advertise auto-negotiation parameters
Parameters
structphy_device*phydev- target phy_device struct
Description
- Writes MII_ADVERTISE with the appropriate values,
- after sanitizing the values to make sure we only advertise what is supported. Returns < 0 on error, 0 if the PHY’s advertisement hasn’t changed, and > 0 if it has changed.
- int
genphy_c37_config_advert(struct phy_device * phydev)¶ sanitize and advertise auto-negotiation parameters
Parameters
structphy_device*phydev- target phy_device struct
Description
- Writes MII_ADVERTISE with the appropriate values,
- after sanitizing the values to make sure we only advertise what is supported. Returns < 0 on error, 0 if the PHY’s advertisement hasn’t changed, and > 0 if it has changed. This function is intended for Clause 37 1000Base-X mode.
- int
phy_probe(structdevice * dev)¶ probe and init a PHY device
Parameters
structdevice*dev- device to probe and init
Description
- Take care of setting up the phy_device structure,
- set the state to READY (the driver’s init function should set it to STARTING if needed).
- struct mii_bus *
mdiobus_alloc_size(size_t size)¶ allocate a mii_bus structure
Parameters
size_tsize- extra amount of memory to allocate for private storage. If non-zero, then bus->priv points to that memory.
Description
called by a bus driver to allocate an mii_bus structure to fill in.
- struct mii_bus *
mdio_find_bus(const char * mdio_name)¶ Given the name of a mdiobus, find the mii_bus.
Parameters
constchar*mdio_name- The name of a mdiobus.
Description
Returns a reference to the mii_bus, or NULL if none found. The embedded struct device will have its reference count incremented, and must be released with put_device() once the bus is finished with.
- struct mii_bus *
of_mdio_find_bus(struct device_node * mdio_bus_np)¶ Given an mii_bus node, find the mii_bus.
Parameters
structdevice_node*mdio_bus_np- Pointer to the mii_bus.
Description
Returns a reference to the mii_bus, or NULL if none found. The embedded struct device will have its reference count incremented, and this must be put once the bus is finished with.
Because the association of a device_node and mii_bus is made via of_mdiobus_register(), the mii_bus cannot be found before it is registered with of_mdiobus_register().
- int
__mdiobus_register(struct mii_bus * bus, struct module * owner)¶ bring up all the PHYs on a given bus and attach them to bus
Parameters
structmii_bus*bus- target mii_bus
structmodule*owner- module containing bus accessor functions
Description
- Called by a bus driver to bring up all the PHYs
- on a given bus, and attach them to the bus. Drivers should use mdiobus_register() rather than __mdiobus_register() unless they need to pass a specific owner module. MDIO devices which are not PHYs will not be brought up by this function. They are expected to be explicitly listed in DT and instantiated by of_mdiobus_register().
Returns 0 on success or < 0 on error.
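Putting mdiobus_alloc_size() and registration together, a bus driver typically looks roughly like the sketch below. foo_mdio_read(), foo_mdio_write() and foo_mdio_priv are hypothetical; the comments note the accessor signatures they are assumed to follow.

/* Sketch of an MDIO bus driver probe routine. */
static int foo_mdio_probe(struct platform_device *pdev)
{
        struct mii_bus *bus;
        int ret;

        /* allocate the bus with room for driver-private data in bus->priv */
        bus = mdiobus_alloc_size(sizeof(struct foo_mdio_priv));
        if (!bus)
                return -ENOMEM;

        bus->name = "foo-mdio";
        snprintf(bus->id, MII_BUS_ID_SIZE, "%s", dev_name(&pdev->dev));
        bus->parent = &pdev->dev;
        bus->read = foo_mdio_read;    /* int (*)(struct mii_bus *, int, int) */
        bus->write = foo_mdio_write;  /* int (*)(struct mii_bus *, int, int, u16) */

        /* mdiobus_register() wraps __mdiobus_register(bus, THIS_MODULE) */
        ret = mdiobus_register(bus);
        if (ret) {
                mdiobus_free(bus);
                return ret;
        }

        platform_set_drvdata(pdev, bus);
        return 0;
}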
- void
mdiobus_free(struct mii_bus * bus)¶ free a struct mii_bus
Parameters
structmii_bus*bus- mii_bus to free
Description
This function releases the reference to the underlying device object in the mii_bus. If this is the last reference, the mii_bus will be freed.
- struct phy_device *
mdiobus_scan(struct mii_bus * bus, int addr)¶ scan a bus for MDIO devices.
Parameters
structmii_bus*bus- mii_bus to scan
intaddr- address on bus to scan
Description
This function scans the MDIO bus, looking for devices which can be identified using a vendor/product ID in registers 2 and 3. Not all MDIO devices have such registers, but PHY devices typically do. Hence this function assumes anything found is a PHY, or can be treated as a PHY. Other MDIO devices, such as switches, will probably not be found during the scan.
- int
__mdiobus_read(struct mii_bus * bus, int addr, u32 regnum)¶ Unlocked version of the mdiobus_read function
Parameters
structmii_bus*bus- the mii_bus struct
intaddr- the phy address
u32regnum- register number to read
Description
Read a MDIO bus register. Caller must hold the mdio bus lock.
NOTE
MUST NOT be called from interrupt context.
- int
__mdiobus_write(struct mii_bus * bus, int addr, u32 regnum, u16 val)¶ Unlocked version of the mdiobus_write function
Parameters
structmii_bus*bus- the mii_bus struct
intaddr- the phy address
u32regnum- register number to write
u16val- value to write toregnum
Description
Write a MDIO bus register. Caller must hold the mdio bus lock.
NOTE
MUST NOT be called from interrupt context.
- int
__mdiobus_modify_changed(struct mii_bus * bus, int addr, u32 regnum, u16 mask, u16 set)¶ Unlocked version of the mdiobus_modify function
Parameters
structmii_bus*bus- the mii_bus struct
intaddr- the phy address
u32regnum- register number to modify
u16mask- bit mask of bits to clear
u16set- bit mask of bits to set
Description
Read, modify, and if any change, write the register value back to the device. Any error returns a negative number.
NOTE
MUST NOT be called from interrupt context.
- int
mdiobus_read_nested(struct mii_bus * bus, int addr, u32 regnum)¶ Nested version of the mdiobus_read function
Parameters
structmii_bus*bus- the mii_bus struct
intaddr- the phy address
u32regnum- register number to read
Description
In case of nested MDIO bus access avoid lockdep false positives by using mutex_lock_nested().
NOTE
MUST NOT be called from interrupt context, because the bus read/write functions may wait for an interrupt to conclude the operation.
- int
mdiobus_read(struct mii_bus * bus, int addr, u32 regnum)¶ Convenience function for reading a given MII mgmt register
Parameters
structmii_bus*bus- the mii_bus struct
intaddr- the phy address
u32regnum- register number to read
NOTE
MUST NOT be called from interrupt context, because the bus read/write functions may wait for an interrupt to conclude the operation.
- int
mdiobus_write_nested(struct mii_bus * bus, int addr, u32 regnum, u16 val)¶ Nested version of the mdiobus_write function
Parameters
structmii_bus*bus- the mii_bus struct
intaddr- the phy address
u32regnum- register number to write
u16val- value to write toregnum
Description
In case of nested MDIO bus access avoid lockdep false positives by using mutex_lock_nested().
NOTE
MUST NOT be called from interrupt context, because the bus read/write functions may wait for an interrupt to conclude the operation.
- int
mdiobus_write(struct mii_bus * bus, int addr, u32 regnum, u16 val)¶ Convenience function for writing a given MII mgmt register
Parameters
structmii_bus*bus- the mii_bus struct
intaddr- the phy address
u32regnum- register number to write
u16val- value to write toregnum
NOTE
MUST NOT be called from interrupt context, because the bus read/write functions may wait for an interrupt to conclude the operation.
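As a small usage sketch, the clause 22 PHY ID can be assembled by hand from registers 2 and 3 (MII_PHYSID1/MII_PHYSID2) using mdiobus_read(); the wrapper function name is hypothetical.

/* Sketch: read the 32-bit clause 22 PHY ID of the device at 'addr'. */
static int foo_read_phy_id(struct mii_bus *bus, int addr, u32 *phy_id)
{
        int id1, id2;

        id1 = mdiobus_read(bus, addr, MII_PHYSID1);
        if (id1 < 0)
                return id1;     /* bus access error */

        id2 = mdiobus_read(bus, addr, MII_PHYSID2);
        if (id2 < 0)
                return id2;

        *phy_id = (id1 << 16) | id2;
        return 0;
}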
- int
mdiobus_modify(struct mii_bus * bus, int addr, u32 regnum, u16 mask, u16 set)¶ Convenience function for modifying a given mdio device register
Parameters
structmii_bus*bus- the mii_bus struct
intaddr- the phy address
u32regnum- register number to write
u16mask- bit mask of bits to clear
u16set- bit mask of bits to set
- void
mdiobus_release(structdevice * d)¶ mii_bus device release callback
Parameters
structdevice*d- the target struct device that contains the mii_bus
Description
called when the last reference to an mii_bus is dropped, to free the underlying memory.
- int
mdiobus_create_device(struct mii_bus * bus, struct mdio_board_info * bi)¶ create a full MDIO device given a mdio_board_info structure
Parameters
structmii_bus*bus- MDIO bus to create the devices on
structmdio_board_info*bi- mdio_board_info structure describing the devices
Description
Returns 0 on success or < 0 on error.
- int
mdio_bus_match(structdevice * dev, struct device_driver * drv)¶ determine if given MDIO driver supports the given MDIO device
Parameters
structdevice*dev- target MDIO device
structdevice_driver*drv- given MDIO driver
Description
- Given a MDIO device, and a MDIO driver, return 1 if
- the driver supports the device. Otherwise, return 0. This may require calling the device’s own match function, since different classes of MDIO devices have different match criteria.
PHYLINK¶
PHYLINK interfaces traditional network drivers with PHYLIB, fixed-links, and SFF modules (e.g. hot-pluggable SFP) that may contain PHYs. PHYLINK provides management of the link state and link modes.
- struct
phylink_link_state¶ link state structure
Definition
struct phylink_link_state { unsigned long advertising[BITS_TO_LONGS(__ETHTOOL_LINK_MODE_MASK_NBITS)]; unsigned long lp_advertising[BITS_TO_LONGS(__ETHTOOL_LINK_MODE_MASK_NBITS)]; phy_interface_t interface; int speed; int duplex; int pause; unsigned int link:1; unsigned int an_enabled:1; unsigned int an_complete:1;};Members
advertising- ethtool bitmask containing advertised link modes
lp_advertising- ethtool bitmask containing link partner advertised link modes
interface- link typedef phy_interface_t mode
speed- link speed, one of the SPEED_* constants.
duplex- link duplex mode, one of DUPLEX_* constants.
pause- link pause state, described by MLO_PAUSE_* constants.
link- true if the link is up.
an_enabled- true if autonegotiation is enabled/desired.
an_complete- true if autonegotiation has completed.
- struct
phylink_config¶ PHYLINK configuration structure
Definition
struct phylink_config { struct device *dev; enum phylink_op_type type; bool pcs_poll; bool poll_fixed_state; void (*get_fixed_state)(struct phylink_config *config, struct phylink_link_state *state);};Members
dev- a pointer to a struct device associated with the MAC
type- operation type of PHYLINK instance
pcs_poll- MAC PCS cannot provide link change interrupt
poll_fixed_state- if true, starts link_poll, if MAC link is at MLO_AN_FIXED mode.
get_fixed_state- callback to execute to determine the fixed link state, if MAC link is at MLO_AN_FIXED mode.
- struct
phylink_mac_ops¶ MAC operations structure.
Definition
struct phylink_mac_ops { void (*validate)(struct phylink_config *config,unsigned long *supported, struct phylink_link_state *state); void (*mac_pcs_get_state)(struct phylink_config *config, struct phylink_link_state *state); int (*mac_prepare)(struct phylink_config *config, unsigned int mode, phy_interface_t iface); void (*mac_config)(struct phylink_config *config, unsigned int mode, const struct phylink_link_state *state); int (*mac_finish)(struct phylink_config *config, unsigned int mode, phy_interface_t iface); void (*mac_an_restart)(struct phylink_config *config); void (*mac_link_down)(struct phylink_config *config, unsigned int mode, phy_interface_t interface); void (*mac_link_up)(struct phylink_config *config,struct phy_device *phy, unsigned int mode,phy_interface_t interface, int speed, int duplex, bool tx_pause, bool rx_pause);};Members
validate- Validate and update the link configuration.
mac_pcs_get_state- Read the current link state from the hardware.
mac_prepare- prepare for a major reconfiguration of the interface.
mac_config- configure the MAC for the selected mode and state.
mac_finish- finish a major reconfiguration of the interface.
mac_an_restart- restart 802.3z BaseX autonegotiation.
mac_link_down- take the link down.
mac_link_up- allow the link to come up.
Description
The individual methods are described more fully below.
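Before the per-method documentation, here is a rough skeleton of how a MAC driver wires these operations up and creates a phylink instance. All foo_mac_* callbacks, foo_priv and its members (phylink_config, np, phy_interface, phylink) are hypothetical stand-ins for driver-private code; phylink_create() and the ops members themselves are the real API documented here.

/* Skeleton: registering phylink_mac_ops and creating a phylink instance. */
static const struct phylink_mac_ops foo_phylink_ops = {
        .validate          = foo_mac_validate,
        .mac_pcs_get_state = foo_mac_pcs_get_state,
        .mac_config        = foo_mac_config,
        .mac_an_restart    = foo_mac_an_restart,
        .mac_link_down     = foo_mac_link_down,
        .mac_link_up       = foo_mac_link_up,
};

static int foo_create_phylink(struct foo_priv *priv, struct net_device *ndev)
{
        struct phylink *pl;

        priv->phylink_config.dev = &ndev->dev;
        priv->phylink_config.type = PHYLINK_NETDEV;

        pl = phylink_create(&priv->phylink_config,
                            of_fwnode_handle(priv->np),   /* DT node of the MAC */
                            priv->phy_interface, &foo_phylink_ops);
        if (IS_ERR(pl))
                return PTR_ERR(pl);

        priv->phylink = pl;
        return 0;
}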
- void
validate(structphylink_config * config, unsigned long * supported, structphylink_link_state * state)¶ Validate and update the link configuration
Parameters
structphylink_config*config- a pointer to a struct phylink_config.
unsignedlong*supported- ethtool bitmask for supported link modes.
structphylink_link_state*state- a pointer to a struct phylink_link_state.
Description
Clear bits in the supported and state->advertising masks that are not supportable by the MAC.
Note that the PHY may be able to transform from one connection technology to another, so, e.g., don’t clear 1000BaseX just because the MAC is unable to do BaseX mode. This is more about clearing unsupported speeds and duplex settings. The port modes should not be cleared; phylink_set_port_modes() will help with this.
If the state->interface mode is PHY_INTERFACE_MODE_1000BASEX or PHY_INTERFACE_MODE_2500BASEX, select the appropriate mode based on state->advertising and/or state->speed and update state->interface accordingly. See phylink_helper_basex_speed().
When state->interface is PHY_INTERFACE_MODE_NA, phylink expects the MAC driver to return all supported link modes.
If the state->interface mode is not supported, then the supported mask must be cleared.
- void
mac_pcs_get_state(structphylink_config * config, structphylink_link_state * state)¶ Read the current inband link state from the hardware
Parameters
structphylink_config*config- a pointer to a struct phylink_config.
structphylink_link_state*state- a pointer to a struct phylink_link_state.
Description
Read the current inband link state from the MAC PCS, reporting the current speed in state->speed, duplex mode in state->duplex, pause mode in state->pause using the MLO_PAUSE_RX and MLO_PAUSE_TX bits, negotiation completion state in state->an_complete, and link up state in state->link. If possible, state->lp_advertising should also be populated.
- int
mac_prepare(structphylink_config * config, unsigned int mode, phy_interface_t iface)¶ prepare to change the PHY interface mode
Parameters
structphylink_config*config- a pointer to a struct phylink_config.
unsignedintmode- one of MLO_AN_FIXED, MLO_AN_PHY, MLO_AN_INBAND.
phy_interface_tiface- interface mode to switch to
Description
phylink will call this method at the beginning of a full initialisation of the link, which includes changing the interface mode or at initial startup time. It may be called for the current mode. The MAC driver should perform whatever actions are required, e.g. disabling the Serdes PHY.
This will be the first call in the sequence:
- mac_prepare()
- mac_config()
- pcs_config()
- possible pcs_an_restart()
- mac_finish()
Returns zero on success, or negative errno on failure which will be reported to the kernel log.
- void
mac_config(structphylink_config * config, unsigned int mode, const structphylink_link_state * state)¶ configure the MAC for the selected mode and state
Parameters
structphylink_config*config- a pointer to a struct phylink_config.
unsignedintmode- one of MLO_AN_FIXED, MLO_AN_PHY, MLO_AN_INBAND.
conststructphylink_link_state*state- a pointer to a struct phylink_link_state.
Description
Note - not all members of state are valid. In particular, state->lp_advertising, state->link, state->an_complete are never guaranteed to be correct, and so any mac_config() implementation must never reference these fields.
(This requires a rewrite - please refer to mac_link_up() for situations where the PCS and MAC are not tightly integrated.)
In all negotiation modes, as defined by mode, state->pause indicates the pause settings which should be applied as follows. If MLO_PAUSE_AN is not set, MLO_PAUSE_TX and MLO_PAUSE_RX indicate whether the MAC should send pause frames and/or act on received pause frames respectively. Otherwise, the results of in-band negotiation/status from the MAC PCS should be used to control the MAC pause mode settings.
The action performed depends on the currently selected mode:
MLO_AN_FIXED, MLO_AN_PHY: Configure for non-inband negotiation mode, where the link settings are completely communicated via mac_link_up(). The physical link protocol from the MAC is specified by state->interface. state->advertising may be used, but is not required.
Older drivers (prior to the mac_link_up() change) may use state->speed, state->duplex and state->pause to configure the MAC, but this is deprecated; such drivers should be converted to use mac_link_up(). Other members of state must be ignored.
Valid state members: interface, advertising. Deprecated state members: speed, duplex, pause.
MLO_AN_INBAND: place the link in an inband negotiation mode (such as 802.3z 1000base-X or Cisco SGMII mode depending on the state->interface mode). In both cases, link state management (whether the link is up or not) is performed by the MAC, and reported via the mac_pcs_get_state() callback. Changes in link state must be made by calling phylink_mac_change(). Interface mode specific details are mentioned below.
If in 802.3z mode, the link speed is fixed, dependent on the state->interface. Duplex and pause modes are negotiated via the in-band configuration word. Advertised pause modes are set according to the state->an_enabled and state->advertising flags. Beware of MACs which only support full duplex at gigabit and higher speeds.
If in Cisco SGMII mode, the link speed and duplex mode are passed in the serial bitstream 16-bit configuration word, and the MAC should be configured to read these bits and acknowledge the configuration word. Nothing is advertised by the MAC. The MAC is responsible for reading the configuration word and configuring itself accordingly.
Valid state members: interface, an_enabled, pause, advertising.
Implementations are expected to update the MAC to reflect the requested settings - i.o.w., if nothing has changed between two calls, no action is expected. If only flow control settings have changed, flow control should be updated without taking the link down. This “update” behaviour is critical to avoid bouncing the link up status.
- int
mac_finish(structphylink_config * config, unsigned int mode, phy_interface_t iface)¶ finish a change of the PHY interface mode
Parameters
structphylink_config*config- a pointer to a struct phylink_config.
unsignedintmode- one of MLO_AN_FIXED, MLO_AN_PHY, MLO_AN_INBAND.
phy_interface_tiface- interface mode to switch to
Description
phylink will call this if it called mac_prepare() to allow the MAC to complete any necessary steps after the MAC and PCS have been configured for the mode and iface. E.g. a MAC driver may wish to re-enable the Serdes PHY here if it was previously disabled by mac_prepare().
Returns zero on success, or negative errno on failure which will be reported to the kernel log.
- void
mac_an_restart(structphylink_config * config)¶ restart 802.3z BaseX autonegotiation
Parameters
structphylink_config*config- a pointer to a struct phylink_config.
- void
mac_link_down(structphylink_config * config, unsigned int mode, phy_interface_t interface)¶ take the link down
Parameters
structphylink_config*config- a pointer to a struct phylink_config.
unsignedintmode- link autonegotiation mode
phy_interface_tinterface- link typedef phy_interface_t mode
Description
If mode is not an in-band negotiation mode (as defined by phylink_autoneg_inband()), force the link down and disable any Energy Efficient Ethernet MAC configuration. Interface type selection must be done in mac_config().
- void
mac_link_up(structphylink_config * config, struct phy_device * phy, unsigned int mode, phy_interface_t interface, int speed, int duplex, bool tx_pause, bool rx_pause)¶ allow the link to come up
Parameters
structphylink_config*config- a pointer to a struct phylink_config.
structphy_device*phy- any attached phy
unsignedintmode- link autonegotiation mode
phy_interface_tinterface- link typedef phy_interface_t mode
intspeed- link speed
intduplex- link duplex
booltx_pause- link transmit pause enablement status
boolrx_pause- link receive pause enablement status
Description
Configure the MAC for an established link.
speed, duplex, tx_pause and rx_pause indicate the finalised link settings, and should be used to configure the MAC block appropriately where these settings are not automatically conveyed from the PCS block, or if in-band negotiation (as defined by phylink_autoneg_inband(mode)) is disabled.
Note that when 802.3z in-band negotiation is in use, it is possible that the user wishes to override the pause settings, and this should be allowed when considering the implementation of this method.
If in-band negotiation mode is disabled, allow the link to come up. If phy is non-NULL, configure Energy Efficient Ethernet by calling phy_init_eee() and perform appropriate MAC configuration for EEE. Interface type selection must be done in mac_config().
- struct
phylink_pcs¶ PHYLINK PCS instance
Definition
struct phylink_pcs { const struct phylink_pcs_ops *ops; bool poll;};Members
ops- a pointer to the struct phylink_pcs_ops structure
poll- poll the PCS for link changes
Description
This structure is designed to be embedded within the PCS private data, and will be passed between phylink and the PCS.
- struct
phylink_pcs_ops¶ MAC PCS operations structure.
Definition
struct phylink_pcs_ops { void (*pcs_get_state)(struct phylink_pcs *pcs, struct phylink_link_state *state); int (*pcs_config)(struct phylink_pcs *pcs, unsigned int mode,phy_interface_t interface,const unsigned long *advertising, bool permit_pause_to_mac); void (*pcs_an_restart)(struct phylink_pcs *pcs); void (*pcs_link_up)(struct phylink_pcs *pcs, unsigned int mode, phy_interface_t interface, int speed, int duplex);};Members
pcs_get_state- read the current MAC PCS link state from the hardware.
pcs_config- configure the MAC PCS for the selected mode and state.
pcs_an_restart- restart 802.3z BaseX autonegotiation.
pcs_link_up- program the PCS for the resolved link configuration (where necessary).
- void
pcs_get_state(structphylink_pcs * pcs, structphylink_link_state * state)¶ Read the current inband link state from the hardware
Parameters
structphylink_pcs*pcs- a pointer to a
structphylink_pcs. structphylink_link_state*state- a pointer to a
structphylink_link_state.
Description
Read the current inband link state from the MAC PCS, reporting the current speed in state->speed, duplex mode in state->duplex, pause mode in state->pause using the MLO_PAUSE_RX and MLO_PAUSE_TX bits, negotiation completion state in state->an_complete, and link up state in state->link. If possible, state->lp_advertising should also be populated.
When present, this overrides mac_pcs_get_state() in struct phylink_mac_ops.
- int
pcs_config(structphylink_pcs * pcs, unsigned int mode, phy_interface_t interface, const unsigned long * advertising, bool permit_pause_to_mac)¶ Configure the PCS mode and advertisement
Parameters
structphylink_pcs*pcs- a pointer to a
structphylink_pcs. unsignedintmode- one of
MLO_AN_FIXED,MLO_AN_PHY,MLO_AN_INBAND. phy_interface_tinterface- interface mode to be used
constunsignedlong*advertising- advertisement ethtool link mode mask
boolpermit_pause_to_mac- permit forwarding pause resolution to MAC
Description
Configure the PCS for the operating mode, the interface mode, and set the advertisement mask. permit_pause_to_mac indicates whether the hardware may forward the pause mode resolution to the MAC.
When operating in MLO_AN_INBAND, inband should always be enabled, otherwise inband should be disabled.
For SGMII, there is no advertisement from the MAC side; the PCS should be programmed to acknowledge the inband word from the PHY.
For 1000BASE-X, the advertisement should be programmed into the PCS.
For most 10GBASE-R, there is no advertisement.
- void
pcs_an_restart(structphylink_pcs * pcs)¶ restart 802.3z BaseX autonegotiation
Parameters
structphylink_pcs*pcs- a pointer to a
structphylink_pcs.
Description
When PCS ops are present, this overrides mac_an_restart() in struct phylink_mac_ops.
- void
pcs_link_up(structphylink_pcs * pcs, unsigned int mode, phy_interface_t interface, int speed, int duplex)¶ program the PCS for the resolved link configuration
Parameters
structphylink_pcs*pcs- a pointer to a
structphylink_pcs. unsignedintmode- link autonegotiation mode
phy_interface_tinterface- link
typedefphy_interface_tmode intspeed- link speed
intduplex- link duplex
Description
This call will be made just before mac_link_up() to inform the PCS of the resolved link parameters. For example, a PCS operating in SGMII mode without in-band AN needs to be manually configured for the link and duplex setting. Otherwise, this should be a no-op.
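Tying the callbacks together, a PCS driver would normally publish a static ops table and bind its embedded struct phylink_pcs to phylink with phylink_set_pcs(). This is only a sketch: the foo_pcs_*() callbacks, the foo_pcs structure and the foo_priv fields (priv->pcs, priv->phylink) are hypothetical and assumed to exist elsewhere in the imaginary driver.

static const struct phylink_pcs_ops foo_pcs_ops = {
	.pcs_get_state	= foo_pcs_get_state,
	.pcs_config	= foo_pcs_config,
	.pcs_an_restart	= foo_pcs_an_restart,
	.pcs_link_up	= foo_pcs_link_up,
};

static void foo_register_pcs(struct foo_priv *priv)
{
	/* Bind the PCS no later than the first mac_config() call. */
	priv->pcs.pcs.ops = &foo_pcs_ops;
	priv->pcs.pcs.poll = true;	/* no link interrupt on this device */
	phylink_set_pcs(priv->phylink, &priv->pcs.pcs);
}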
- struct
phylink¶ internal data type for phylink
Definition
struct phylink {};Members
- void
phylink_set_port_modes(unsigned long * mask)¶ set the port type modes in the ethtool mask
Parameters
unsignedlong*mask- ethtool link mode mask
Description
Sets all the port type modes in the ethtool mask. MAC drivers should use this in their ‘validate’ callback.
- structphylink *
phylink_create(structphylink_config * config, struct fwnode_handle * fwnode, phy_interface_t iface, const structphylink_mac_ops * mac_ops)¶ create a phylink instance
Parameters
structphylink_config*config- a pointer to the target
structphylink_config structfwnode_handle*fwnode- a pointer to a
structfwnode_handledescribing the networkinterface phy_interface_tiface- the desired link mode defined by
typedefphy_interface_t conststructphylink_mac_ops*mac_ops- a pointer to a
structphylink_mac_opsfor the MAC.
Description
Create a new phylink instance, and parse the link parameters found in fwnode. This will parse in-band modes, fixed-link or SFP configuration.
Returns a pointer to a struct phylink, or an error-pointer value. Users must use IS_ERR() to check for errors from this function.
Note
the rtnl lock must not be held when calling this function.
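A minimal probe-time sketch, assuming a hypothetical foo_priv with ndev, phylink_config and phylink members and a foo_phylink_mac_ops table defined elsewhere in the imaginary driver:

#include <linux/err.h>
#include <linux/of.h>
#include <linux/of_net.h>
#include <linux/phylink.h>

static int foo_probe_phylink(struct foo_priv *priv, struct device_node *np)
{
	struct phylink *pl;
	phy_interface_t phy_mode;
	int err;

	/* Determine the desired interface mode from the firmware node. */
	err = of_get_phy_mode(np, &phy_mode);
	if (err)
		return err;

	priv->phylink_config.dev = &priv->ndev->dev;
	priv->phylink_config.type = PHYLINK_NETDEV;

	pl = phylink_create(&priv->phylink_config, of_fwnode_handle(np),
			    phy_mode, &foo_phylink_mac_ops);
	if (IS_ERR(pl))
		return PTR_ERR(pl);

	priv->phylink = pl;
	return 0;
}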
- void
phylink_set_pcs(structphylink * pl, structphylink_pcs * pcs)¶ set the current PCS for phylink to use
Parameters
structphylink*pl- a pointer to a
structphylinkreturned fromphylink_create() structphylink_pcs*pcs- a pointer to the
structphylink_pcs
Description
Bind the MAC PCS to phylink. This may be called after phylink_create(), in mac_prepare() or mac_config() methods if it is desired to dynamically change the PCS.
Please note that there are behavioural changes with the mac_config() callback if a PCS is present (denoting a newer setup) so removing a PCS is not supported, and if a PCS is going to be used, it must be registered by calling phylink_set_pcs() at the latest in the first mac_config() call.
- void
phylink_destroy(structphylink * pl)¶ cleanup and destroy the phylink instance
Parameters
structphylink*pl- a pointer to a
structphylinkreturned fromphylink_create()
Description
Destroy a phylink instance. Any PHY that has been attached must have been cleaned up via phylink_disconnect_phy() prior to calling this function.
Note
the rtnl lock must not be held when calling this function.
- int
phylink_connect_phy(structphylink * pl, struct phy_device * phy)¶ connect a PHY to the phylink instance
Parameters
structphylink*pl- a pointer to a
structphylinkreturned fromphylink_create() structphy_device*phy- a pointer to a
structphy_device.
Description
Connect phy to the phylink instance specified by pl by calling phy_attach_direct(). Configure the phy according to the MAC driver’s capabilities, start the PHYLIB state machine and enable any interrupts that the PHY supports.
This updates the phylink’s ethtool supported and advertising link mode masks.
Returns 0 on success or a negative errno.
- int
phylink_of_phy_connect(structphylink * pl, struct device_node * dn, u32 flags)¶ connect the PHY specified in the DT node.
Parameters
structphylink*pl- a pointer to a
structphylinkreturned fromphylink_create() structdevice_node*dn- a pointer to a
structdevice_node. u32flags- PHY-specific flags to communicate to the PHY device driver
Description
Connect the phy specified in the device node dn to the phylink instance specified by pl. Actions specified in phylink_connect_phy() will be performed.
Returns 0 on success or a negative errno.
- void
phylink_disconnect_phy(structphylink * pl)¶ disconnect any PHY attached to the phylink instance
Parameters
structphylink*pl- a pointer to a
structphylinkreturned fromphylink_create()
Description
Disconnect any current PHY from the phylink instance described by pl.
- void
phylink_mac_change(structphylink * pl, bool up)¶ notify phylink of a change in MAC state
Parameters
structphylink*pl- a pointer to a
structphylinkreturned fromphylink_create() boolup- indicates whether the link is currently up.
Description
The MAC driver should call this when the state of its link changes (e.g., link failure, new negotiation results, etc.)
- void
phylink_start(structphylink * pl)¶ start a phylink instance
Parameters
structphylink*pl- a pointer to a
structphylinkreturned fromphylink_create()
Description
Start the phylink instance specified by pl, configuring the MAC for the desired link mode(s) and negotiation style. This should be called from the network device driver’s struct net_device_ops ndo_open() method.
- void
phylink_stop(structphylink * pl)¶ stop a phylink instance
Parameters
structphylink*pl- a pointer to a
structphylinkreturned fromphylink_create()
Description
Stop the phylink instance specified by pl. This should be called from the network device driver’s struct net_device_ops ndo_stop() method. The network device’s carrier state should not be changed prior to calling this function.
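Taken together with phylink_of_phy_connect() and phylink_disconnect_phy(), a driver’s ndo_open()/ndo_stop() pair typically reduces to something like the sketch below. foo_priv, priv->of_node and the hardware bring-up/quiesce steps are invented for illustration.

#include <linux/netdevice.h>
#include <linux/phylink.h>

static int foo_open(struct net_device *ndev)
{
	struct foo_priv *priv = netdev_priv(ndev);
	int err;

	err = phylink_of_phy_connect(priv->phylink, priv->of_node, 0);
	if (err)
		return err;

	/* ... bring up queues, DMA, interrupts, etc. ... */

	phylink_start(priv->phylink);
	return 0;
}

static int foo_stop(struct net_device *ndev)
{
	struct foo_priv *priv = netdev_priv(ndev);

	phylink_stop(priv->phylink);
	phylink_disconnect_phy(priv->phylink);

	/* ... quiesce the hardware ... */

	return 0;
}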
- void
phylink_ethtool_get_wol(structphylink * pl, struct ethtool_wolinfo * wol)¶ get the wake on lan parameters for the PHY
Parameters
structphylink*pl- a pointer to a
structphylinkreturned fromphylink_create() structethtool_wolinfo*wol- a pointer to
structethtool_wolinfoto hold the read parameters
Description
Read the wake on lan parameters from the PHY attached to the phylink instance specified by pl. If no PHY is currently attached, report no support for wake on lan.
- int
phylink_ethtool_set_wol(structphylink * pl, struct ethtool_wolinfo * wol)¶ set wake on lan parameters
Parameters
structphylink*pl- a pointer to a
structphylinkreturned fromphylink_create() structethtool_wolinfo*wol- a pointer to
structethtool_wolinfofor the desired parameters
Description
Set the wake on lan parameters for the PHY attached to the phylink instance specified by pl. If no PHY is attached, returns an EOPNOTSUPP error.
Returns zero on success or negative errno code.
- int
phylink_ethtool_ksettings_get(structphylink * pl, struct ethtool_link_ksettings * kset)¶ get the current link settings
Parameters
structphylink*pl- a pointer to a
structphylinkreturned fromphylink_create() structethtool_link_ksettings*kset- a pointer to a
structethtool_link_ksettingsto hold link settings
Description
Read the current link settings for the phylink instance specified by pl. This will be the link settings read from the MAC, PHY or fixed link settings depending on the current negotiation mode.
- int
phylink_ethtool_ksettings_set(structphylink * pl, const struct ethtool_link_ksettings * kset)¶ set the link settings
Parameters
structphylink*pl- a pointer to a
structphylinkreturned fromphylink_create() conststructethtool_link_ksettings*kset- a pointer to a
structethtool_link_ksettingsfor the desired modes
- int
phylink_ethtool_nway_reset(structphylink * pl)¶ restart negotiation
Parameters
structphylink*pl- a pointer to a
structphylinkreturned fromphylink_create()
Description
Restart negotiation for the phylink instance specified by pl. This will cause any attached phy to restart negotiation with the link partner, and if the MAC is in a BaseX mode, the MAC will also be requested to restart negotiation.
Returns zero on success, or negative error code.
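In practice the corresponding ethtool_ops entries usually just forward to phylink. A hedged sketch, again using an invented foo_priv:

#include <linux/ethtool.h>
#include <linux/netdevice.h>
#include <linux/phylink.h>

static int foo_get_link_ksettings(struct net_device *ndev,
				  struct ethtool_link_ksettings *kset)
{
	struct foo_priv *priv = netdev_priv(ndev);

	return phylink_ethtool_ksettings_get(priv->phylink, kset);
}

static int foo_set_link_ksettings(struct net_device *ndev,
				  const struct ethtool_link_ksettings *kset)
{
	struct foo_priv *priv = netdev_priv(ndev);

	return phylink_ethtool_ksettings_set(priv->phylink, kset);
}

static int foo_nway_reset(struct net_device *ndev)
{
	struct foo_priv *priv = netdev_priv(ndev);

	return phylink_ethtool_nway_reset(priv->phylink);
}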
- void
phylink_ethtool_get_pauseparam(structphylink * pl, struct ethtool_pauseparam * pause)¶ get the current pause parameters
Parameters
structphylink*pl- a pointer to a
structphylinkreturned fromphylink_create() structethtool_pauseparam*pause- a pointer to a
structethtool_pauseparam
- int
phylink_ethtool_set_pauseparam(structphylink * pl, struct ethtool_pauseparam * pause)¶ set the current pause parameters
Parameters
structphylink*pl- a pointer to a
structphylinkreturned fromphylink_create() structethtool_pauseparam*pause- a pointer to a
structethtool_pauseparam
- int
phylink_get_eee_err(structphylink * pl)¶ read the energy efficient ethernet error counter
Parameters
structphylink*pl- a pointer to a
structphylinkreturned fromphylink_create().
Description
Read the Energy Efficient Ethernet error counter from the PHY associated with the phylink instance specified by pl.
Returns positive error counter value, or negative error code.
- int
phylink_init_eee(structphylink * pl, bool clk_stop_enable)¶ init and check the EEE features
Parameters
structphylink*pl- a pointer to a
structphylinkreturned fromphylink_create() boolclk_stop_enable- allow PHY to stop receive clock
Description
Must be called either with RTNL held or within mac_link_up().
- int
phylink_ethtool_get_eee(structphylink * pl, struct ethtool_eee * eee)¶ read the energy efficient ethernet parameters
Parameters
structphylink*pl- a pointer to a
structphylinkreturned fromphylink_create() structethtool_eee*eee- a pointer to a
structethtool_eeefor the read parameters
- int
phylink_ethtool_set_eee(structphylink * pl, struct ethtool_eee * eee)¶ set the energy efficient ethernet parameters
Parameters
structphylink*pl- a pointer to a
structphylinkreturned fromphylink_create() structethtool_eee*eee- a pointer to a
structethtool_eeefor the desired parameters
- int
phylink_mii_ioctl(structphylink * pl, struct ifreq * ifr, int cmd)¶ generic mii ioctl interface
Parameters
structphylink*pl- a pointer to a
structphylinkreturned fromphylink_create() structifreq*ifr- a pointer to a
structifreqfor socket ioctls intcmd- ioctl cmd to execute
Description
Perform the specified MII ioctl on the PHY attached to the phylink instance specified by pl. If no PHY is attached, emulate the presence of the PHY.
SIOCGMIIPHY: read register from the current PHY.
SIOCGMIIREG: read register from the specified PHY.
SIOCSMIIREG: set a register on the specified PHY.
Return
zero on success or negative error code.
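A typical ndo_do_ioctl() implementation simply hands MII requests to phylink, for example (foo_priv is hypothetical):

#include <linux/netdevice.h>
#include <linux/phylink.h>

static int foo_ioctl(struct net_device *ndev, struct ifreq *ifr, int cmd)
{
	struct foo_priv *priv = netdev_priv(ndev);

	/* The PHY is only connected while the interface is open. */
	if (!netif_running(ndev))
		return -EINVAL;

	return phylink_mii_ioctl(priv->phylink, ifr, cmd);
}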
- int
phylink_speed_down(structphylink * pl, bool sync)¶ set the non-SFP PHY to the lowest speed supported by both link partners
Parameters
structphylink*pl- a pointer to a
structphylinkreturned fromphylink_create() boolsync- perform action synchronously
Description
If we have a PHY that is not part of an SFP module, then set the speed as described in the phy_speed_down() function. Please see this function for a description of the sync parameter.
Returns zero if there is no PHY, otherwise as per phy_speed_down().
- int
phylink_speed_up(structphylink * pl)¶ restore the advertised speeds prior to the call to
phylink_speed_down()
Parameters
structphylink*pl- a pointer to a
structphylinkreturned fromphylink_create()
Description
If we have a PHY that is not part of an SFP module, then restore the PHY speeds as per phy_speed_up().
Returns zero if there is no PHY, otherwise as per phy_speed_up().
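A common pairing is to drop the link speed while Wake-on-LAN is armed and restore it on resume. A sketch under the assumption that a hypothetical priv->wolopts field records the configured WoL options:

#include <linux/device.h>
#include <linux/phylink.h>

static int foo_suspend(struct device *dev)
{
	struct foo_priv *priv = dev_get_drvdata(dev);

	/* Negotiate the lowest common speed to save power while armed. */
	if (priv->wolopts)
		phylink_speed_down(priv->phylink, false);

	return 0;
}

static int foo_resume(struct device *dev)
{
	struct foo_priv *priv = dev_get_drvdata(dev);

	/* Restore the advertisement that phylink_speed_down() reduced. */
	if (priv->wolopts)
		phylink_speed_up(priv->phylink);

	return 0;
}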
- void
phylink_helper_basex_speed(structphylink_link_state * state)¶ 1000BaseX/2500BaseX helper
Parameters
structphylink_link_state*state- a pointer to a
structphylink_link_state
Description
Inspect the interface mode, advertising mask or forced speed and decide whether to run at 2.5Gbit or 1Gbit appropriately, switching the interface mode to suit. state->interface is appropriately updated, and the advertising mask has the “other” baseX_Full flag cleared.
- void
phylink_mii_c22_pcs_get_state(struct mdio_device * pcs, structphylink_link_state * state)¶ read the MAC PCS state
Parameters
structmdio_device*pcs- a pointer to a
structmdio_device. structphylink_link_state*state- a pointer to a
structphylink_link_state.
Description
Helper for MAC PCS supporting the 802.3 clause 22 register set for clause 37 negotiation and/or SGMII control.
Read the MAC PCS state from the MII device configured in pcs and parse the Clause 37 or Cisco SGMII link partner negotiation word into the phylink state structure. This is suitable to be directly plugged into the mac_pcs_get_state() member of the struct phylink_mac_ops structure.
- int
phylink_mii_c22_pcs_set_advertisement(struct mdio_device * pcs, phy_interface_t interface, const unsigned long * advertising)¶ configure the clause 37 PCS advertisement
Parameters
structmdio_device*pcs- a pointer to a
structmdio_device. phy_interface_tinterface- the PHY interface mode being configured
constunsignedlong*advertising- the ethtool advertisement mask
Description
Helper for MAC PCS supporting the 802.3 clause 22 register set for clause 37 negotiation and/or SGMII control.
Configure the clause 37 PCS advertisement as specified by advertising. This does not trigger a renegotiation; phylink will do that via the mac_an_restart() method of the struct phylink_mac_ops structure.
Returns negative error code on failure to configure the advertisement, zero if no change has been made, or one if the advertisement has changed.
- int
phylink_mii_c22_pcs_config(struct mdio_device * pcs, unsigned int mode, phy_interface_t interface, const unsigned long * advertising)¶ configure clause 22 PCS
Parameters
structmdio_device*pcs- a pointer to a
structmdio_device. unsignedintmode- link autonegotiation mode
phy_interface_tinterface- the PHY interface mode being configured
constunsignedlong*advertising- the ethtool advertisement mask
Description
Configure a Clause 22 PCS PHY with the appropriate negotiation parameters for the given mode, interface and advertising mask. Returns a negative error number on failure, zero if the advertisement has not changed, or positive if there is a change.
- void
phylink_mii_c22_pcs_an_restart(struct mdio_device * pcs)¶ restart 802.3z autonegotiation
Parameters
structmdio_device*pcs- a pointer to a
structmdio_device.
Description
Helper for MAC PCS supporting the 802.3 clause 22 register set for clause 37 negotiation.
Restart the clause 37 negotiation with the link partner. This is suitable to be directly plugged into the mac_an_restart() member of the struct phylink_mac_ops structure.
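For a PCS whose registers follow the clause 22 layout, the phylink_pcs_ops callbacks can often be thin wrappers around these helpers. The sketch below reuses the hypothetical foo_pcs/to_foo_pcs() from the earlier phylink_pcs sketch, with fpcs->mdio assumed to be the PCS’s struct mdio_device.

#include <linux/phylink.h>

static void foo_pcs_get_state(struct phylink_pcs *pcs,
			      struct phylink_link_state *state)
{
	phylink_mii_c22_pcs_get_state(to_foo_pcs(pcs)->mdio, state);
}

static int foo_pcs_config(struct phylink_pcs *pcs, unsigned int mode,
			  phy_interface_t interface,
			  const unsigned long *advertising,
			  bool permit_pause_to_mac)
{
	return phylink_mii_c22_pcs_config(to_foo_pcs(pcs)->mdio, mode,
					  interface, advertising);
}

static void foo_pcs_an_restart(struct phylink_pcs *pcs)
{
	phylink_mii_c22_pcs_an_restart(to_foo_pcs(pcs)->mdio);
}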
SFP support¶
- struct
sfp_bus¶ internal representation of a sfp bus
Definition
struct sfp_bus {};Members
- struct
sfp_eeprom_id¶ raw SFP module identification information
Definition
struct sfp_eeprom_id { struct sfp_eeprom_base base; struct sfp_eeprom_ext ext;};Members
base- base SFP module identification structure
ext- extended SFP module identification structure
Description
See the SFF-8472 specification and related documents for the definition of these structure members. This can be obtained from https://www.snia.org/technology-communities/sff/specifications
- struct
sfp_upstream_ops¶ upstream operations structure
Definition
struct sfp_upstream_ops { void (*attach)(void *priv, struct sfp_bus *bus); void (*detach)(void *priv, struct sfp_bus *bus); int (*module_insert)(void *priv, const struct sfp_eeprom_id *id); void (*module_remove)(void *priv); int (*module_start)(void *priv); void (*module_stop)(void *priv); void (*link_down)(void *priv); void (*link_up)(void *priv); int (*connect_phy)(void *priv, struct phy_device *); void (*disconnect_phy)(void *priv);};Members
attach- called when the sfp socket driver is bound to the upstream (mandatory).
detach- called when the sfp socket driver is unbound from the upstream (mandatory).
module_insert- called after a module has been detected to determine whether the module is supported for the upstream device.
module_remove- called after the module has been removed.
module_start- called after the PHY probe step
module_stop- called before the PHY is removed
link_down- called when the link is non-operational for whatever reason.
link_up- called when the link is operational.
connect_phy- called when an I2C accessible PHY has been detected on the module.
disconnect_phy- called when a module with an I2C accessible PHY has been removed.
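A sketch of how an upstream (e.g. MAC) driver might populate this structure. The foo_sfp_*() callbacks are hypothetical implementations of the members documented above; attach and detach are the mandatory ones.

#include <linux/sfp.h>

static const struct sfp_upstream_ops foo_sfp_ops = {
	.attach		= foo_sfp_attach,	/* mandatory */
	.detach		= foo_sfp_detach,	/* mandatory */
	.module_insert	= foo_sfp_module_insert,
	.module_remove	= foo_sfp_module_remove,
	.link_up	= foo_sfp_link_up,
	.link_down	= foo_sfp_link_down,
};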
- int
sfp_parse_port(structsfp_bus * bus, const structsfp_eeprom_id * id, unsigned long * support)¶ Parse the EEPROM base ID, setting the port type
Parameters
structsfp_bus*bus- a pointer to the
structsfp_busstructure for the sfp module conststructsfp_eeprom_id*id- a pointer to the module’s
structsfp_eeprom_id unsignedlong*support- optional pointer to an array of unsigned long for theethtool support mask
Description
Parse the EEPROM identification given in id, and return one of PORT_TP, PORT_FIBRE or PORT_OTHER. If support is non-NULL, also set the ethtool ETHTOOL_LINK_MODE_xxx_BIT corresponding with the connector type.
If the port type is not known, returns PORT_OTHER.
- bool
sfp_may_have_phy(structsfp_bus * bus, const structsfp_eeprom_id * id)¶ indicate whether the module may have a PHY
Parameters
structsfp_bus*bus- a pointer to the
structsfp_busstructure for the sfp module conststructsfp_eeprom_id*id- a pointer to the module’s
structsfp_eeprom_id
Description
Parse the EEPROM identification given in id, and return whether this module may have a PHY.
- void
sfp_parse_support(structsfp_bus * bus, const structsfp_eeprom_id * id, unsigned long * support)¶ Parse the eeprom id for supported link modes
Parameters
structsfp_bus*bus- a pointer to the
structsfp_busstructure for the sfp module conststructsfp_eeprom_id*id- a pointer to the module’s
structsfp_eeprom_id unsignedlong*support- pointer to an array of unsigned long for the ethtool support mask
Description
Parse the EEPROM identification information and derive the supported ethtool link modes for the module.
- phy_interface_t
sfp_select_interface(structsfp_bus * bus, unsigned long * link_modes)¶ Select appropriate phy_interface_t mode
Parameters
structsfp_bus*bus- a pointer to the
structsfp_busstructure for the sfp module unsignedlong*link_modes- ethtool link modes mask
Description
Derive the phy_interface_t mode for the SFP module from the link modes mask.
- void
sfp_bus_put(structsfp_bus * bus)¶ put a reference on the
structsfp_bus
Parameters
structsfp_bus*bus- the
structsfp_busfound viasfp_bus_find_fwnode()
Description
Put a reference on the struct sfp_bus and free the underlying structure if this was the last reference.
- int
sfp_get_module_info(structsfp_bus * bus, struct ethtool_modinfo * modinfo)¶ Get the ethtool_modinfo for a SFP module
Parameters
structsfp_bus*bus- a pointer to the
structsfp_busstructure for the sfp module structethtool_modinfo*modinfo- a
structethtool_modinfo
Description
Fill in the type and eeprom_len parameters in modinfo for a module on the sfp bus specified by bus.
Returns 0 on success or a negative errno number.
- int
sfp_get_module_eeprom(structsfp_bus * bus, struct ethtool_eeprom * ee, u8 * data)¶ Read the SFP module EEPROM
Parameters
structsfp_bus*bus- a pointer to the
structsfp_busstructure for the sfp module structethtool_eeprom*ee- a
structethtool_eeprom u8*data- buffer to contain the EEPROM data (must be at leastee->len bytes)
Description
Read the EEPROM as specified by the supplied ee. See the documentation for struct ethtool_eeprom for the region to be read.
Returns 0 on success or a negative errno number.
- void
sfp_upstream_start(structsfp_bus * bus)¶ Inform the SFP that the network device is up
Parameters
structsfp_bus*bus- a pointer to the
structsfp_busstructure for the sfp module
Description
Inform the SFP socket that the network device is now up, so that the module can be enabled by allowing TX_DISABLE to be deasserted. This should be called from the network device driver’s struct net_device_ops ndo_open() method.
- void
sfp_upstream_stop(structsfp_bus * bus)¶ Inform the SFP that the network device is down
Parameters
structsfp_bus*bus- a pointer to the
structsfp_busstructure for the sfp module
Description
Inform the SFP socket that the network device is going down, so that the module can be disabled by asserting TX_DISABLE, disabling the laser in optical modules. This should be called from the network device driver’s struct net_device_ops ndo_stop() method.
- structsfp_bus *
sfp_bus_find_fwnode(struct fwnode_handle * fwnode)¶ parse and locate the SFP bus from fwnode
Parameters
structfwnode_handle*fwnode- firmware node for the parent device (MAC or PHY)
Description
Parse the parent device’s firmware node for a SFP bus, and locate the sfp_bus structure, incrementing its reference count. This must be put via sfp_bus_put() when done.
Return
- on success, a pointer to the sfp_bus structure,
- NULL if no SFP is specified,
- on failure, an error pointer value:
- corresponding to the errors detailed for fwnode_property_get_reference_args().
- -ENOMEM if we failed to allocate the bus.
- an error from the upstream’s connect_phy() method.
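Putting the reference-counting rules together, an upstream driver’s probe path usually looks like the sketch below. foo_attach_sfp(), foo_priv and foo_sfp_ops are hypothetical; the pattern of dropping the sfp_bus_find_fwnode() reference once sfp_bus_add_upstream() has taken its own, and of tolerating a NULL bus when no SFP is specified, mirrors what the phylink core does.

#include <linux/err.h>
#include <linux/sfp.h>

static int foo_attach_sfp(struct foo_priv *priv, struct fwnode_handle *fwnode)
{
	struct sfp_bus *bus;
	int err;

	bus = sfp_bus_find_fwnode(fwnode);
	if (IS_ERR(bus))
		return PTR_ERR(bus);

	/* sfp_bus_add_upstream() takes its own reference on the bus, so the
	 * reference obtained above can be dropped straight away. */
	err = sfp_bus_add_upstream(bus, priv, &foo_sfp_ops);
	sfp_bus_put(bus);

	return err;
}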
- int
sfp_bus_add_upstream(structsfp_bus * bus, void * upstream, const structsfp_upstream_ops * ops)¶ parse and register the neighbouring device
Parameters
structsfp_bus*bus- the
structsfp_busfound viasfp_bus_find_fwnode() void*upstream- the upstream private data
conststructsfp_upstream_ops*ops- the upstream’s
structsfp_upstream_ops
Description
Add upstream driver for the SFP bus, and if the bus is complete, register the SFP bus using sfp_register_upstream(). This takes a reference on the bus, so it is safe to put the bus after this call.
Return
0 on success or a negative errno number.
- void
sfp_bus_del_upstream(structsfp_bus * bus)¶ delete a previously registered upstream connection
Parameters
structsfp_bus*bus- a pointer to the
structsfp_busstructure for the sfp module
Description
Delete a previously registered upstream connection for the SFP module. bus should have been added by sfp_bus_add_upstream().