af89623ebaaebbccdccee6bfea72bb50008922ca
208 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
| b35c8682be |
Merge tag 'ASB-2021-12-05_4.19-stable' of https://android.googlesource.com/kernel/common
https://source.android.com/security/bulletin/2021-12-01 CVE-2021-33909 CVE-2021-38204 CVE-2021-0961 * tag 'ASB-2021-12-05_4.19-stable': (1065 commits) BACKPORT: arm64: vdso32: suppress error message for 'make mrproper' Linux 4.19.219 tty: hvc: replace BUG_ON() with negative return value xen/netfront: don't trust the backend response data blindly xen/netfront: disentangle tx_skb_freelist xen/netfront: don't read data from request on the ring page xen/netfront: read response from backend only once xen/blkfront: don't trust the backend response data blindly xen/blkfront: don't take local copy of a request from the ring page xen/blkfront: read response from backend only once xen: sync include/xen/interface/io/ring.h with Xen's newest version fuse: release pipe buf after last use NFC: add NCI_UNREG flag to eliminate the race hugetlbfs: flush TLBs correctly after huge_pmd_unshare s390/mm: validate VMA in PGSTE manipulation functions tracing: Check pid filtering when creating events vhost/vsock: fix incorrect used length reported to the guest net: hns3: fix VF RSS failed problem after PF enable multi-TCs net/smc: Don't call clcsock shutdown twice when smc shutdown MIPS: use 3-level pgtable for 64KB page size on MIPS_VA_BITS_48 ... Change-Id: Iaa72ffe6492c1a9a32cbd8769ae00c3f47ed198b Conflicts: arch/arm64/boot/dts/rockchip/rk3328.dtsi drivers/media/i2c/imx258.c drivers/soc/rockchip/Kconfig drivers/usb/host/ehci.h |
|||
| 47e51a7a22 |
Merge 4.19.218 into android-4.19-stable
Changes in 4.19.218
xhci: Fix USB 3.1 enumeration issues by increasing roothub power-on-good delay
binder: use euid from cred instead of using task
binder: use cred instead of task for selinux checks
Input: elantench - fix misreporting trackpoint coordinates
Input: i8042 - Add quirk for Fujitsu Lifebook T725
libata: fix read log timeout value
ocfs2: fix data corruption on truncate
mmc: dw_mmc: Dont wait for DRTO on Write RSP error
parisc: Fix ptrace check on syscall return
tpm: Check for integer overflow in tpm2_map_response_body()
firmware/psci: fix application of sizeof to pointer
crypto: s5p-sss - Add error handling in s5p_aes_probe()
media: ite-cir: IR receiver stop working after receive overflow
media: ir-kbd-i2c: improve responsiveness of hauppauge zilog receivers
ALSA: hda/realtek: Add quirk for Clevo PC70HS
ALSA: ua101: fix division by zero at probe
ALSA: 6fire: fix control and bulk message timeouts
ALSA: line6: fix control and interrupt message timeouts
ALSA: usb-audio: Add registration quirk for JBL Quantum 400
ALSA: synth: missing check for possible NULL after the call to kstrdup
ALSA: timer: Fix use-after-free problem
ALSA: timer: Unconditionally unlink slave instances, too
x86/sme: Use #define USE_EARLY_PGTABLE_L5 in mem_encrypt_identity.c
x86/irq: Ensure PI wakeup handler is unregistered before module unload
cavium: Return negative value when pci_alloc_irq_vectors() fails
scsi: qla2xxx: Fix unmap of already freed sgl
cavium: Fix return values of the probe function
sfc: Don't use netif_info before net_device setup
hyperv/vmbus: include linux/bitops.h
mmc: winbond: don't build on M68K
drm: panel-orientation-quirks: Add quirk for Aya Neo 2021
bpf: Prevent increasing bpf_jit_limit above max
xen/netfront: stop tx queues during live migration
spi: spl022: fix Microwire full duplex mode
watchdog: Fix OMAP watchdog early handling
vmxnet3: do not stop tx queues after netif_device_detach()
btrfs: clear MISSING device status bit in btrfs_close_one_device
btrfs: fix lost error handling when replaying directory deletes
btrfs: call btrfs_check_rw_degradable only if there is a missing device
ia64: kprobes: Fix to pass correct trampoline address to the handler
hwmon: (pmbus/lm25066) Add offset coefficients
regulator: s5m8767: do not use reset value as DVS voltage if GPIO DVS is disabled
regulator: dt-bindings: samsung,s5m8767: correct s5m8767,pmic-buck-default-dvs-idx property
EDAC/sb_edac: Fix top-of-high-memory value for Broadwell/Haswell
mwifiex: fix division by zero in fw download path
ath6kl: fix division by zero in send path
ath6kl: fix control-message timeout
ath10k: fix control-message timeout
ath10k: fix division by zero in send path
PCI: Mark Atheros QCA6174 to avoid bus reset
rtl8187: fix control-message timeouts
evm: mark evm_fixmode as __ro_after_init
wcn36xx: Fix HT40 capability for 2Ghz band
mwifiex: Read a PCI register after writing the TX ring write pointer
libata: fix checking of DMA state
wcn36xx: handle connection loss indication
rsi: fix occasional initialisation failure with BT coex
rsi: fix key enabled check causing unwanted encryption for vap_id > 0
rsi: fix rate mask set leading to P2P failure
rsi: Fix module dev_oper_mode parameter description
RDMA/qedr: Fix NULL deref for query_qp on the GSI QP
signal: Remove the bogus sigkill_pending in ptrace_stop
signal/mips: Update (_save|_restore)_fp_context to fail with -EFAULT
power: supply: max17042_battery: Prevent int underflow in set_soc_threshold
power: supply: max17042_battery: use VFSOC for capacity when no rsns
powerpc/85xx: Fix oops when mpc85xx_smp_guts_ids node cannot be found
serial: core: Fix initializing and restoring termios speed
ALSA: mixer: oss: Fix racy access to slots
ALSA: mixer: fix deadlock in snd_mixer_oss_set_volume
xen/balloon: add late_initcall_sync() for initial ballooning done
PCI: aardvark: Do not clear status bits of masked interrupts
PCI: aardvark: Do not unmask unused interrupts
PCI: aardvark: Fix return value of MSI domain .alloc() method
PCI: aardvark: Read all 16-bits from PCIE_MSI_PAYLOAD_REG
quota: check block number when reading the block in quota file
quota: correct error number in free_dqentry()
pinctrl: core: fix possible memory leak in pinctrl_enable()
iio: dac: ad5446: Fix ad5622_write() return value
USB: serial: keyspan: fix memleak on probe errors
USB: iowarrior: fix control-message timeouts
drm: panel-orientation-quirks: Add quirk for KD Kurio Smart C15200 2-in-1
Bluetooth: sco: Fix lock_sock() blockage by memcpy_from_msg()
Bluetooth: fix use-after-free error in lock_sock_nested()
platform/x86: wmi: do not fail if disabling fails
MIPS: lantiq: dma: add small delay after reset
MIPS: lantiq: dma: reset correct number of channel
locking/lockdep: Avoid RCU-induced noinstr fail
net: sched: update default qdisc visibility after Tx queue cnt changes
smackfs: Fix use-after-free in netlbl_catmap_walk()
x86: Increase exception stack sizes
mwifiex: Run SET_BSS_MODE when changing from P2P to STATION vif-type
mwifiex: Properly initialize private structure on interface type changes
media: mt9p031: Fix corrupted frame after restarting stream
media: netup_unidvb: handle interrupt properly according to the firmware
media: uvcvideo: Set capability in s_param
media: uvcvideo: Return -EIO for control errors
media: s5p-mfc: fix possible null-pointer dereference in s5p_mfc_probe()
media: s5p-mfc: Add checking to s5p_mfc_probe().
media: mceusb: return without resubmitting URB in case of -EPROTO error.
ia64: don't do IA64_CMPXCHG_DEBUG without CONFIG_PRINTK
media: rcar-csi2: Add checking to rcsi2_start_receiver()
ACPICA: Avoid evaluating methods too early during system resume
media: usb: dvd-usb: fix uninit-value bug in dibusb_read_eeprom_byte()
tracefs: Have tracefs directories not set OTH permission bits by default
ath: dfs_pattern_detector: Fix possible null-pointer dereference in channel_detector_create()
ACPI: battery: Accept charges over the design capacity as full
leaking_addresses: Always print a trailing newline
memstick: r592: Fix a UAF bug when removing the driver
lib/xz: Avoid overlapping memcpy() with invalid input with in-place decompression
lib/xz: Validate the value before assigning it to an enum variable
workqueue: make sysfs of unbound kworker cpumask more clever
tracing/cfi: Fix cmp_entries_* functions signature mismatch
mwl8k: Fix use-after-free in mwl8k_fw_state_machine()
PM: hibernate: Get block device exclusively in swsusp_check()
iwlwifi: mvm: disable RX-diversity in powersave
smackfs: use __GFP_NOFAIL for smk_cipso_doi()
ARM: clang: Do not rely on lr register for stacktrace
gre/sit: Don't generate link-local addr if addr_gen_mode is IN6_ADDR_GEN_MODE_NONE
ARM: 9136/1: ARMv7-M uses BE-8, not BE-32
spi: bcm-qspi: Fix missing clk_disable_unprepare() on error in bcm_qspi_probe()
x86/hyperv: Protect set_hv_tscchange_cb() against getting preempted
parisc: fix warning in flush_tlb_all
task_stack: Fix end_of_stack() for architectures with upwards-growing stack
parisc/unwind: fix unwinder when CONFIG_64BIT is enabled
parisc/kgdb: add kgdb_roundup() to make kgdb work with idle polling
Bluetooth: fix init and cleanup of sco_conn.timeout_work
cgroup: Make rebind_subsystems() disable v2 controllers all at once
net: dsa: rtl8366rb: Fix off-by-one bug
drm/amdgpu: fix warning for overflow check
media: em28xx: add missing em28xx_close_extension
media: dvb-usb: fix ununit-value in az6027_rc_query
media: mtk-vpu: Fix a resource leak in the error handling path of 'mtk_vpu_probe()'
media: si470x: Avoid card name truncation
media: cx23885: Fix snd_card_free call on null card pointer
cpuidle: Fix kobject memory leaks in error paths
media: em28xx: Don't use ops->suspend if it is NULL
ath9k: Fix potential interrupt storm on queue reset
media: dvb-frontends: mn88443x: Handle errors of clk_prepare_enable()
crypto: qat - detect PFVF collision after ACK
crypto: qat - disregard spurious PFVF interrupts
hwrng: mtk - Force runtime pm ops for sleep ops
b43legacy: fix a lower bounds test
b43: fix a lower bounds test
mmc: sdhci-omap: Fix NULL pointer exception if regulator is not configured
memstick: avoid out-of-range warning
memstick: jmb38x_ms: use appropriate free function in jmb38x_ms_alloc_host()
hwmon: Fix possible memleak in __hwmon_device_register()
hwmon: (pmbus/lm25066) Let compiler determine outer dimension of lm25066_coeff
ath10k: fix max antenna gain unit
drm/msm: uninitialized variable in msm_gem_import()
net: stream: don't purge sk_error_queue in sk_stream_kill_queues()
mmc: mxs-mmc: disable regulator on error and in the remove function
platform/x86: thinkpad_acpi: Fix bitwise vs. logical warning
rsi: stop thread firstly in rsi_91x_init() error handling
mwifiex: Send DELBA requests according to spec
phy: micrel: ksz8041nl: do not use power down mode
nvme-rdma: fix error code in nvme_rdma_setup_ctrl
PM: hibernate: fix sparse warnings
clocksource/drivers/timer-ti-dm: Select TIMER_OF
drm/msm: Fix potential NULL dereference in DPU SSPP
smackfs: use netlbl_cfg_cipsov4_del() for deleting cipso_v4_doi
s390/gmap: don't unconditionally call pte_unmap_unlock() in __gmap_zap()
irq: mips: avoid nested irq_enter()
tcp: don't free a FIN sk_buff in tcp_remove_empty_skb()
samples/kretprobes: Fix return value if register_kretprobe() failed
KVM: s390: Fix handle_sske page fault handling
libertas_tf: Fix possible memory leak in probe and disconnect
libertas: Fix possible memory leak in probe and disconnect
wcn36xx: add proper DMA memory barriers in rx path
net: amd-xgbe: Toggle PLL settings during rate change
net: phylink: avoid mvneta warning when setting pause parameters
crypto: pcrypt - Delay write to padata->info
selftests/bpf: Fix fclose/pclose mismatch in test_progs
ibmvnic: Process crqs after enabling interrupts
RDMA/rxe: Fix wrong port_cap_flags
ARM: s3c: irq-s3c24xx: Fix return value check for s3c24xx_init_intc()
arm64: dts: rockchip: Fix GPU register width for RK3328
RDMA/bnxt_re: Fix query SRQ failure
ARM: dts: at91: tse850: the emac<->phy interface is rmii
scsi: dc395: Fix error case unwinding
MIPS: loongson64: make CPU_LOONGSON64 depends on MIPS_FP_SUPPORT
JFS: fix memleak in jfs_mount
ALSA: hda: Reduce udelay() at SKL+ position reporting
arm: dts: omap3-gta04a4: accelerometer irq fix
soc/tegra: Fix an error handling path in tegra_powergate_power_up()
memory: fsl_ifc: fix leak of irq and nand_irq in fsl_ifc_ctrl_probe
video: fbdev: chipsfb: use memset_io() instead of memset()
serial: 8250_dw: Drop wrong use of ACPI_PTR()
usb: gadget: hid: fix error code in do_config()
power: supply: rt5033_battery: Change voltage values to µV
scsi: csiostor: Uninitialized data in csio_ln_vnp_read_cbfn()
RDMA/mlx4: Return missed an error if device doesn't support steering
ASoC: cs42l42: Correct some register default values
ASoC: cs42l42: Defer probe if request_threaded_irq() returns EPROBE_DEFER
phy: qcom-qusb2: Fix a memory leak on probe
serial: xilinx_uartps: Fix race condition causing stuck TX
mips: cm: Convert to bitfield API to fix out-of-bounds access
power: supply: bq27xxx: Fix kernel crash on IRQ handler register error
apparmor: fix error check
rpmsg: Fix rpmsg_create_ept return when RPMSG config is not defined
pnfs/flexfiles: Fix misplaced barrier in nfs4_ff_layout_prepare_ds
drm/plane-helper: fix uninitialized variable reference
PCI: aardvark: Don't spam about PIO Response Status
NFS: Fix deadlocks in nfs_scan_commit_list()
fs: orangefs: fix error return code of orangefs_revalidate_lookup()
mtd: spi-nor: hisi-sfc: Remove excessive clk_disable_unprepare()
dmaengine: at_xdmac: fix AT_XDMAC_CC_PERID() macro
auxdisplay: img-ascii-lcd: Fix lock-up when displaying empty string
auxdisplay: ht16k33: Connect backlight to fbdev
auxdisplay: ht16k33: Fix frame buffer device blanking
netfilter: nfnetlink_queue: fix OOB when mac header was cleared
dmaengine: dmaengine_desc_callback_valid(): Check for `callback_result`
m68k: set a default value for MEMORY_RESERVE
watchdog: f71808e_wdt: fix inaccurate report in WDIOC_GETTIMEOUT
ar7: fix kernel builds for compiler test
scsi: qla2xxx: Fix gnl list corruption
scsi: qla2xxx: Turn off target reset during issue_lip
i2c: xlr: Fix a resource leak in the error handling path of 'xlr_i2c_probe()'
xen-pciback: Fix return in pm_ctrl_init()
net: davinci_emac: Fix interrupt pacing disable
ACPI: PMIC: Fix intel_pmic_regs_handler() read accesses
bonding: Fix a use-after-free problem when bond_sysfs_slave_add() failed
mm/zsmalloc.c: close race window between zs_pool_dec_isolated() and zs_unregister_migration()
zram: off by one in read_block_state()
llc: fix out-of-bound array index in llc_sk_dev_hash()
nfc: pn533: Fix double free when pn533_fill_fragment_skbs() fails
arm64: pgtable: make __pte_to_phys/__phys_to_pte_val inline functions
vsock: prevent unnecessary refcnt inc for nonblocking connect
cxgb4: fix eeprom len when diagnostics not implemented
USB: chipidea: fix interrupt deadlock
ARM: 9155/1: fix early early_iounmap()
ARM: 9156/1: drop cc-option fallbacks for architecture selection
f2fs: should use GFP_NOFS for directory inodes
9p/net: fix missing error check in p9_check_errors
powerpc/lib: Add helper to check if offset is within conditional branch range
powerpc/bpf: Validate branch ranges
powerpc/bpf: Fix BPF_SUB when imm == 0x80000000
powerpc/security: Add a helper to query stf_barrier type
powerpc/bpf: Emit stf barrier instruction sequences for BPF_NOSPEC
mm, oom: pagefault_out_of_memory: don't force global OOM for dying tasks
mm, oom: do not trigger out_of_memory from the #PF
backlight: gpio-backlight: Correct initial power state handling
video: backlight: Drop maximum brightness override for brightness zero
s390/cio: check the subchannel validity for dev_busid
s390/tape: fix timer initialization in tape_std_assign()
PCI: Add PCI_EXP_DEVCTL_PAYLOAD_* macros
fuse: truncate pagecache on atomic_o_trunc
x86/cpu: Fix migration safety with X86_BUG_NULL_SEL
ext4: fix lazy initialization next schedule time computation in more granular unit
fortify: Explicitly disable Clang support
parisc/entry: fix trace test in syscall exit path
PCI/MSI: Destroy sysfs before freeing entries
PCI/MSI: Deal with devices lying about their MSI mask capability
PCI: Add MSI masking quirk for Nvidia ION AHCI
erofs: remove the occupied parameter from z_erofs_pagevec_enqueue()
erofs: fix unsafe pagevec reuse of hooked pclusters
arm64: zynqmp: Do not duplicate flash partition label property
arm64: zynqmp: Fix serial compatible string
scsi: lpfc: Fix list_add() corruption in lpfc_drain_txq()
arm64: dts: hisilicon: fix arm,sp805 compatible string
usb: musb: tusb6010: check return value after calling platform_get_resource()
usb: typec: tipd: Remove WARN_ON in tps6598x_block_read
arm64: dts: freescale: fix arm,sp805 compatible string
ASoC: nau8824: Add DMI quirk mechanism for active-high jack-detect
scsi: advansys: Fix kernel pointer leak
firmware_loader: fix pre-allocated buf built-in firmware use
ARM: dts: omap: fix gpmc,mux-add-data type
usb: host: ohci-tmio: check return value after calling platform_get_resource()
ALSA: ISA: not for M68K
tty: tty_buffer: Fix the softlockup issue in flush_to_ldisc
MIPS: sni: Fix the build
scsi: target: Fix ordered tag handling
scsi: target: Fix alua_tg_pt_gps_count tracking
powerpc/5200: dts: fix memory node unit name
ALSA: gus: fix null pointer dereference on pointer block
powerpc/dcr: Use cmplwi instead of 3-argument cmpli
sh: check return code of request_irq
maple: fix wrong return value of maple_bus_init().
f2fs: fix up f2fs_lookup tracepoints
sh: fix kconfig unmet dependency warning for FRAME_POINTER
sh: define __BIG_ENDIAN for math-emu
mips: BCM63XX: ensure that CPU_SUPPORTS_32BIT_KERNEL is set
sched/core: Mitigate race cpus_share_cache()/update_top_cache_domain()
drm/nouveau: hdmigv100.c: fix corrupted HDMI Vendor InfoFrame
net: bnx2x: fix variable dereferenced before check
iavf: check for null in iavf_fix_features
iavf: Fix for the false positive ASQ/ARQ errors while issuing VF reset
MIPS: generic/yamon-dt: fix uninitialized variable error
mips: bcm63xx: add support for clk_get_parent()
mips: lantiq: add support for clk_get_parent()
platform/x86: hp_accel: Fix an error handling path in 'lis3lv02d_probe()'
net: virtio_net_hdr_to_skb: count transport header in UFO
i40e: Fix correct max_pkt_size on VF RX queue
i40e: Fix NULL ptr dereference on VSI filter sync
i40e: Fix changing previously set num_queue_pairs for PFs
i40e: Fix display error code in dmesg
NFC: reorganize the functions in nci_request
NFC: reorder the logic in nfc_{un,}register_device
perf/x86/intel/uncore: Fix filter_tid mask for CHA events on Skylake Server
perf/x86/intel/uncore: Fix IIO event constraints for Skylake Server
tun: fix bonding active backup with arp monitoring
hexagon: export raw I/O routines for modules
ipc: WARN if trying to remove ipc object which is absent
mm: kmemleak: slob: respect SLAB_NOLEAKTRACE flag
x86/hyperv: Fix NULL deref in set_hv_tscchange_cb() if Hyper-V setup fails
udf: Fix crash after seekdir
btrfs: fix memory ordering between normal and ordered work functions
parisc/sticon: fix reverse colors
cfg80211: call cfg80211_stop_ap when switch from P2P_GO type
drm/udl: fix control-message timeout
drm/amdgpu: fix set scaling mode Full/Full aspect/Center not works on vga and dvi connectors
perf/core: Avoid put_page() when GUP fails
batman-adv: mcast: fix duplicate mcast packets in BLA backbone from LAN
batman-adv: Consider fragmentation for needed_headroom
batman-adv: Reserve needed_*room for fragments
batman-adv: Don't always reallocate the fragmentation skb head
RDMA/netlink: Add __maybe_unused to static inline in C file
ASoC: DAPM: Cover regression by kctl change notification fix
usb: max-3421: Use driver data instead of maintaining a list of bound devices
soc/tegra: pmc: Fix imbalanced clock disabling in error code path
Linux 4.19.218
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I3f87fc92fe2a7a19ddddb522916f74dba7929583
|
|||
| 5c6fb0e0c7 |
bpf: Prevent increasing bpf_jit_limit above max
[ Upstream commit fadb7ff1a6c2c565af56b4aacdd086b067eed440 ] Restrict bpf_jit_limit to the maximum supported by the arch's JIT. Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211014142554.53120-4-lmb@cloudflare.com Signed-off-by: Sasha Levin <sashal@kernel.org> |
|||
| ddc7145079 |
Merge tag 'ASB-2021-10-05_4.19-stable' of https://android.googlesource.com/kernel/common
https://source.android.com/security/bulletin/2021-10-01 CVE-2020-29368 CVE-2020-29660 CVE-2021-0707 CVE-2020-10768 CVE-2021-29647 * tag 'ASB-2021-10-05_4.19-stable': (1087 commits) Linux 4.19.206 net: don't unconditionally copy_from_user a struct ifreq for socket ioctls Revert "floppy: reintroduce O_NDELAY fix" KVM: x86/mmu: Treat NX as used (not reserved) for all !TDP shadow MMUs fbmem: add margin check to fb_check_caps() vt_kdsetmode: extend console locking net/rds: dma_map_sg is entitled to merge entries drm/nouveau/disp: power down unused DP links during init drm: Copy drm_wait_vblank to user before returning qed: Fix null-pointer dereference in qed_rdma_create_qp() qed: qed ll2 race condition fixes vringh: Use wiov->used to check for read/write desc order virtio_pci: Support surprise removal of virtio pci device virtio: Improve vq->broken access to avoid any compiler optimization opp: remove WARN when no valid OPPs remain usb: gadget: u_audio: fix race condition on endpoint stop net: hns3: fix get wrong pfc_en when query PFC configuration net: marvell: fix MVNETA_TX_IN_PRGRS bit number xgene-v2: Fix a resource leak in the error handling path of 'xge_probe()' ip_gre: add validation for csum_start ... Change-Id: Ib7da21059ab763a87255147340e4cba92e6b3f61 Conflicts: arch/arm/boot/dts/rk322x.dtsi arch/arm/boot/dts/rk3288.dtsi drivers/dma/pl330.c drivers/regulator/core.c drivers/staging/android/ion/ion.c drivers/usb/dwc3/core.c drivers/usb/dwc3/ep0.c drivers/usb/gadget/function/u_audio.c |
|||
| 7dee443c55 |
net: core: Add filter config option
Add this config option to reduce memory usage, if it was not necessary. ./ksize.sh net before size: 716596 Bytes after size: 673095 Bytes save size: 43501 Bytes Signed-off-by: David Wu <david.wu@rock-chips.com> Change-Id: I012d926a7a8be9a4897a85fbf47f7fbbab1e43c4 |
|||
| 11156bde8d |
Merge 4.19.207 into android-4.19-stable
Changes in 4.19.207 ext4: fix race writing to an inline_data file while its xattrs are changing xtensa: fix kconfig unmet dependency warning for HAVE_FUTEX_CMPXCHG gpu: ipu-v3: Fix i.MX IPU-v3 offset calculations for (semi)planar U/V formats qed: Fix the VF msix vectors flow net: macb: Add a NULL check on desc_ptp qede: Fix memset corruption perf/x86/intel/pt: Fix mask of num_address_ranges perf/x86/amd/ibs: Work around erratum #1197 cryptoloop: add a deprecation warning ARM: 8918/2: only build return_address() if needed ALSA: pcm: fix divide error in snd_pcm_lib_ioctl clk: fix build warning for orphan_list media: stkwebcam: fix memory leak in stk_camera_probe ARM: imx: add missing clk_disable_unprepare() ARM: imx: fix missing 3rd argument in macro imx_mmdc_perf_init igmp: Add ip_mc_list lock in ip_check_mc_rcu USB: serial: mos7720: improve OOM-handling in read_mos_reg() ipv4/icmp: l3mdev: Perform icmp error route lookup on source device routing table (v2) SUNRPC/nfs: Fix return value for nfs4_callback_compound() crypto: talitos - reduce max key size for SEC1 powerpc/module64: Fix comment in R_PPC64_ENTRY handling powerpc/boot: Delete unneeded .globl _zimage_start net: ll_temac: Remove left-over debug message mm/page_alloc: speed up the iteration of max_order Revert "btrfs: compression: don't try to compress if we don't have enough pages" ALSA: usb-audio: Add registration quirk for JBL Quantum 800 usb: host: xhci-rcar: Don't reload firmware after the completion usb: mtu3: use @mult for HS isoc or intr usb: mtu3: fix the wrong HS mult value x86/reboot: Limit Dell Optiplex 990 quirk to early BIOS versions PCI: Call Max Payload Size-related fixup quirks early locking/mutex: Fix HANDOFF condition regmap: fix the offset of register error log crypto: mxs-dcp - Check for DMA mapping errors sched/deadline: Fix reset_on_fork reporting of DL tasks power: supply: axp288_fuel_gauge: Report register-address on readb / writeb errors crypto: omap-sham - clear dma flags only after omap_sham_update_dma_stop() sched/deadline: Fix missing clock update in migrate_task_rq_dl() hrtimer: Avoid double reprogramming in __hrtimer_start_range_ns() udf: Check LVID earlier isofs: joliet: Fix iocharset=utf8 mount option bcache: add proper error unwinding in bcache_device_init nvme-rdma: don't update queue count when failing to set io queues power: supply: max17042_battery: fix typo in MAx17042_TOFF s390/cio: add dev_busid sysfs entry for each subchannel libata: fix ata_host_start() crypto: qat - do not ignore errors from enable_vf2pf_comms() crypto: qat - handle both source of interrupt in VF ISR crypto: qat - fix reuse of completion variable crypto: qat - fix naming for init/shutdown VF to PF notifications crypto: qat - do not export adf_iov_putmsg() fcntl: fix potential deadlock for &fasync_struct.fa_lock udf_get_extendedattr() had no boundary checks. m68k: emu: Fix invalid free in nfeth_cleanup() spi: spi-fsl-dspi: Fix issue with uninitialized dma_slave_config spi: spi-pic32: Fix issue with uninitialized dma_slave_config lib/mpi: use kcalloc in mpi_resize clocksource/drivers/sh_cmt: Fix wrong setting if don't request IRQ for clock source channel crypto: qat - use proper type for vf_mask certs: Trigger creation of RSA module signing key if it's not an RSA key spi: sprd: Fix the wrong WDG_LOAD_VAL media: TDA1997x: enable EDID support soc: rockchip: ROCKCHIP_GRF should not default to y, unconditionally media: dvb-usb: fix uninit-value in dvb_usb_adapter_dvb_init media: dvb-usb: fix uninit-value in vp702x_read_mac_addr media: go7007: remove redundant initialization Bluetooth: sco: prevent information leak in sco_conn_defer_accept() tcp: seq_file: Avoid skipping sk during tcp_seek_last_pos net: cipso: fix warnings in netlbl_cipsov4_add_std i2c: highlander: add IRQ check media: em28xx-input: fix refcount bug in em28xx_usb_disconnect media: venus: venc: Fix potential null pointer dereference on pointer fmt PCI: PM: Avoid forcing PCI_D0 for wakeup reasons inconsistently PCI: PM: Enable PME if it can be signaled from D3cold soc: qcom: smsm: Fix missed interrupts if state changes while masked Bluetooth: increase BTNAMSIZ to 21 chars to fix potential buffer overflow drm/msm/dpu: make dpu_hw_ctl_clear_all_blendstages clear necessary LMs arm64: dts: exynos: correct GIC CPU interfaces address range on Exynos7 Bluetooth: fix repeated calls to sco_sock_kill drm/msm/dsi: Fix some reference counted resource leaks usb: gadget: udc: at91: add IRQ check usb: phy: fsl-usb: add IRQ check usb: phy: twl6030: add IRQ checks Bluetooth: Move shutdown callback before flushing tx and rx queue usb: host: ohci-tmio: add IRQ check usb: phy: tahvo: add IRQ check mac80211: Fix insufficient headroom issue for AMSDU usb: gadget: mv_u3d: request_irq() after initializing UDC Bluetooth: add timeout sanity check to hci_inquiry i2c: iop3xx: fix deferred probing i2c: s3c2410: fix IRQ check mmc: dw_mmc: Fix issue with uninitialized dma_slave_config mmc: moxart: Fix issue with uninitialized dma_slave_config CIFS: Fix a potencially linear read overflow i2c: mt65xx: fix IRQ check usb: ehci-orion: Handle errors of clk_prepare_enable() in probe usb: bdc: Fix an error handling path in 'bdc_probe()' when no suitable DMA config is available tty: serial: fsl_lpuart: fix the wrong mapbase value ath6kl: wmi: fix an error code in ath6kl_wmi_sync_point() bcma: Fix memory leak for internally-handled cores ipv4: make exception cache less predictible net: sched: Fix qdisc_rate_table refcount leak when get tcf_block failed net: qualcomm: fix QCA7000 checksum handling ipv4: fix endianness issue in inet_rtm_getroute_build_skb() netns: protect netns ID lookups with RCU fscrypt: add fscrypt_symlink_getattr() for computing st_size ext4: report correct st_size for encrypted symlinks f2fs: report correct st_size for encrypted symlinks ubifs: report correct st_size for encrypted symlinks tty: Fix data race between tiocsti() and flush_to_ldisc() x86/resctrl: Fix a maybe-uninitialized build warning treated as error KVM: x86: Update vCPU's hv_clock before back to guest when tsc_offset is adjusted IMA: remove -Wmissing-prototypes warning IMA: remove the dependency on CRYPTO_MD5 fbmem: don't allow too huge resolutions backlight: pwm_bl: Improve bootloader/kernel device handover clk: kirkwood: Fix a clocking boot regression rtc: tps65910: Correct driver module alias btrfs: reset replace target device to allocation state on close blk-zoned: allow zone management send operations without CAP_SYS_ADMIN blk-zoned: allow BLKREPORTZONE without CAP_SYS_ADMIN PCI/MSI: Skip masking MSI-X on Xen PV powerpc/perf/hv-gpci: Fix counter value parsing xen: fix setting of max_pfn in shared_info include/linux/list.h: add a macro to test if entry is pointing to the head 9p/xen: Fix end of loop tests for list_for_each_entry bpf/verifier: per-register parent pointers bpf: correct slot_type marking logic to allow more stack slot sharing bpf: Support variable offset stack access from helpers bpf: Reject indirect var_off stack access in raw mode bpf: Reject indirect var_off stack access in unpriv mode bpf: Sanity check max value for var_off stack access selftests/bpf: Test variable offset stack access bpf: track spill/fill of constants selftests/bpf: fix tests due to const spill/fill bpf: Introduce BPF nospec instruction for mitigating Spectre v4 bpf: Fix leakage due to insufficient speculative store bypass mitigation bpf: verifier: Allocate idmap scratch in verifier env bpf: Fix pointer arithmetic mask tightening under state pruning tools/thermal/tmon: Add cross compiling support soc: aspeed: lpc-ctrl: Fix boundary check for mmap arm64: head: avoid over-mapping in map_memory crypto: public_key: fix overflow during implicit conversion block: bfq: fix bfq_set_next_ioprio_data() power: supply: max17042: handle fails of reading status register dm crypt: Avoid percpu_counter spinlock contention in crypt_page_alloc() VMCI: fix NULL pointer dereference when unmapping queue pair media: uvc: don't do DMA on stack media: rc-loopback: return number of emitters rather than error libata: add ATA_HORKAGE_NO_NCQ_TRIM for Samsung 860 and 870 SSDs ARM: 9105/1: atags_to_fdt: don't warn about stack size PCI: Restrict ASMedia ASM1062 SATA Max Payload Size Supported PCI: Return ~0 data on pciconfig_read() CAP_SYS_ADMIN failure PCI: xilinx-nwl: Enable the clock through CCF PCI: aardvark: Increase polling delay to 1.5s while waiting for PIO response PCI: aardvark: Fix masking and unmasking legacy INTx interrupts HID: input: do not report stylus battery state as "full" RDMA/iwcm: Release resources if iw_cm module initialization fails docs: Fix infiniband uverbs minor number pinctrl: samsung: Fix pinctrl bank pin count vfio: Use config not menuconfig for VFIO_NOIOMMU powerpc/stacktrace: Include linux/delay.h openrisc: don't printk() unconditionally pinctrl: single: Fix error return code in pcs_parse_bits_in_pinctrl_entry() scsi: qedi: Fix error codes in qedi_alloc_global_queues() platform/x86: dell-smbios-wmi: Add missing kfree in error-exit from run_smbios_call fscache: Fix cookie key hashing f2fs: fix to account missing .skipped_gc_rwsem f2fs: fix to unmap pages from userspace process in punch_hole() MIPS: Malta: fix alignment of the devicetree buffer userfaultfd: prevent concurrent API initialization media: dib8000: rewrite the init prbs logic crypto: mxs-dcp - Use sg_mapping_iter to copy data PCI: Use pci_update_current_state() in pci_enable_device_flags() tipc: keep the skb in rcv queue until the whole data is read iio: dac: ad5624r: Fix incorrect handling of an optional regulator. ARM: dts: qcom: apq8064: correct clock names video: fbdev: kyro: fix a DoS bug by restricting user input netlink: Deal with ESRCH error in nlmsg_notify() Smack: Fix wrong semantics in smk_access_entry() usb: host: fotg210: fix the endpoint's transactional opportunities calculation usb: host: fotg210: fix the actual_length of an iso packet usb: gadget: u_ether: fix a potential null pointer dereference usb: gadget: composite: Allow bMaxPower=0 if self-powered staging: board: Fix uninitialized spinlock when attaching genpd tty: serial: jsm: hold port lock when reporting modem line changes drm/amd/amdgpu: Update debugfs link_settings output link_rate field in hex bpf/tests: Fix copy-and-paste error in double word test bpf/tests: Do not PASS tests without actually testing the result video: fbdev: asiliantfb: Error out if 'pixclock' equals zero video: fbdev: kyro: Error out if 'pixclock' equals zero video: fbdev: riva: Error out if 'pixclock' equals zero ipv4: ip_output.c: Fix out-of-bounds warning in ip_copy_addrs() flow_dissector: Fix out-of-bounds warnings s390/jump_label: print real address in a case of a jump label bug serial: 8250: Define RX trigger levels for OxSemi 950 devices xtensa: ISS: don't panic in rs_init hvsi: don't panic on tty_register_driver failure serial: 8250_pci: make setup_port() parameters explicitly unsigned staging: ks7010: Fix the initialization of the 'sleep_status' structure samples: bpf: Fix tracex7 error raised on the missing argument ata: sata_dwc_460ex: No need to call phy_exit() befre phy_init() Bluetooth: skip invalid hci_sync_conn_complete_evt bonding: 3ad: fix the concurrency between __bond_release_one() and bond_3ad_state_machine_handler() ASoC: Intel: bytcr_rt5640: Move "Platform Clock" routes to the maps for the matching in-/output media: imx258: Rectify mismatch of VTS value media: imx258: Limit the max analogue gain to 480 media: v4l2-dv-timings.c: fix wrong condition in two for-loops media: TDA1997x: fix tda1997x_query_dv_timings() return value media: tegra-cec: Handle errors of clk_prepare_enable() ARM: dts: imx53-ppd: Fix ACHC entry arm64: dts: qcom: sdm660: use reg value for memory node net: ethernet: stmmac: Do not use unreachable() in ipq806x_gmac_probe() Bluetooth: schedule SCO timeouts with delayed_work Bluetooth: avoid circular locks in sco_sock_connect gpu: drm: amd: amdgpu: amdgpu_i2c: fix possible uninitialized-variable access in amdgpu_i2c_router_select_ddc_port() ARM: tegra: tamonten: Fix UART pad setting Bluetooth: Fix handling of LE Enhanced Connection Complete serial: sh-sci: fix break handling for sysrq tcp: enable data-less, empty-cookie SYN with TFO_SERVER_COOKIE_NOT_REQD rpc: fix gss_svc_init cleanup on failure staging: rts5208: Fix get_ms_information() heap buffer size gfs2: Don't call dlm after protocol is unmounted of: Don't allow __of_attached_node_sysfs() without CONFIG_SYSFS mmc: sdhci-of-arasan: Check return value of non-void funtions mmc: rtsx_pci: Fix long reads when clock is prescaled selftests/bpf: Enlarge select() timeout for test_maps mmc: core: Return correct emmc response in case of ioctl error cifs: fix wrong release in sess_alloc_buffer() failed path Revert "USB: xhci: fix U1/U2 handling for hardware with XHCI_INTEL_HOST quirk set" usb: musb: musb_dsps: request_irq() after initializing musb usbip: give back URBs for unsent unlink requests during cleanup usbip:vhci_hcd USB port can get stuck in the disabled state ASoC: rockchip: i2s: Fix regmap_ops hang ASoC: rockchip: i2s: Fixup config for DAIFMT_DSP_A/B parport: remove non-zero check on count ath9k: fix OOB read ar9300_eeprom_restore_internal ath9k: fix sleeping in atomic context net: fix NULL pointer reference in cipso_v4_doi_free net: w5100: check return value after calling platform_get_resource() parisc: fix crash with signals and alloca ovl: fix BUG_ON() in may_delete() when called from ovl_cleanup() scsi: BusLogic: Fix missing pr_cont() use scsi: qla2xxx: Sync queue idx with queue_pair_map idx cpufreq: powernv: Fix init_chip_info initialization in numa=off mm/hugetlb: initialize hugetlb_usage in mm_init memcg: enable accounting for pids in nested pid namespaces platform/chrome: cros_ec_proto: Send command again when timeout occurs drm/amdgpu: Fix BUG_ON assert dm thin metadata: Fix use-after-free in dm_bm_set_read_only xen: reset legacy rtc flag for PV domU bnx2x: Fix enabling network interfaces without VFs arm64/sve: Use correct size when reinitialising SVE state PM: base: power: don't try to use non-existing RTC for storing data PCI: Add AMD GPU multi-function power dependencies x86/mm: Fix kern_addr_valid() to cope with existing but not present entries tipc: fix an use-after-free issue in tipc_recvmsg net-caif: avoid user-triggerable WARN_ON(1) ptp: dp83640: don't define PAGE0 dccp: don't duplicate ccid when cloning dccp sock net/l2tp: Fix reference count leak in l2tp_udp_recv_core r6040: Restore MDIO clock frequency after MAC reset tipc: increase timeout in tipc_sk_enqueue() perf machine: Initialize srcline string member in add_location struct net/mlx5: Fix potential sleeping in atomic context events: Reuse value read using READ_ONCE instead of re-reading it net/af_unix: fix a data-race in unix_dgram_poll net: dsa: destroy the phylink instance on any error in dsa_slave_phy_setup tcp: fix tp->undo_retrans accounting in tcp_sacktag_one() qed: Handle management FW error ibmvnic: check failover_pending in login response net: hns3: pad the short tunnel frame before sending to hardware mm/memory_hotplug: use "unsigned long" for PFN in zone_for_pfn_range() KVM: s390: index kvm->arch.idle_mask by vcpu_idx dt-bindings: mtd: gpmc: Fix the ECC bytes vs. OOB bytes equation mfd: Don't use irq_create_mapping() to resolve a mapping PCI: Add ACS quirks for Cavium multi-function devices net: usb: cdc_mbim: avoid altsetting toggling for Telit LN920 block, bfq: honor already-setup queue merges ethtool: Fix an error code in cxgb2.c NTB: perf: Fix an error code in perf_setup_inbuf() mfd: axp20x: Update AXP288 volatile ranges PCI: Fix pci_dev_str_match_path() alloc while atomic bug KVM: arm64: Handle PSCI resets before userspace touches vCPU state PCI: Sync __pci_register_driver() stub for CONFIG_PCI=n mtd: rawnand: cafe: Fix a resource leak in the error handling path of 'cafe_nand_probe()' ARC: export clear_user_page() for modules net: dsa: b53: Fix calculating number of switch ports netfilter: socket: icmp6: fix use-after-scope fq_codel: reject silly quantum parameters qlcnic: Remove redundant unlock in qlcnic_pinit_from_rom ip_gre: validate csum_start only on pull net: renesas: sh_eth: Fix freeing wrong tx descriptor s390/bpf: Fix 64-bit subtraction of the -0x80000000 constant Linux 4.19.207 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I18108cb47ba9e95838ebe55aaabe34de345ee846 |
|||
| 91cdb5b362 |
bpf: Introduce BPF nospec instruction for mitigating Spectre v4
commit f5e81d1117501546b7be050c5fbafa6efd2c722c upstream. In case of JITs, each of the JIT backends compiles the BPF nospec instruction /either/ to a machine instruction which emits a speculation barrier /or/ to /no/ machine instruction in case the underlying architecture is not affected by Speculative Store Bypass or has different mitigations in place already. This covers both x86 and (implicitly) arm64: In case of x86, we use 'lfence' instruction for mitigation. In case of arm64, we rely on the firmware mitigation as controlled via the ssbd kernel parameter. Whenever the mitigation is enabled, it works for all of the kernel code with no need to provide any additional instructions here (hence only comment in arm64 JIT). Other archs can follow as needed. The BPF nospec instruction is specifically targeting Spectre v4 since i) we don't use a serialization barrier for the Spectre v1 case, and ii) mitigation instructions for v1 and v4 might be different on some archs. The BPF nospec is required for a future commit, where the BPF verifier does annotate intermediate BPF programs with speculation barriers. Co-developed-by: Piotr Krysiuk <piotras@gmail.com> Co-developed-by: Benedict Schlueter <benedict.schlueter@rub.de> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Piotr Krysiuk <piotras@gmail.com> Signed-off-by: Benedict Schlueter <benedict.schlueter@rub.de> Acked-by: Alexei Starovoitov <ast@kernel.org> [OP: adjusted context for 4.19, drop riscv and ppc32 changes] Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
|||
| a6850bb536 |
Merge 4.19.206 into android-4.19-stable
Changes in 4.19.206 net: qrtr: fix another OOB Read in qrtr_endpoint_post bpf: Do not use ax register in interpreter on div/mod bpf: Fix 32 bit src register truncation on div/mod bpf: Fix truncation handling for mod32 dst reg wrt zero ARC: Fix CONFIG_STACKDEPOT netfilter: conntrack: collect all entries in one cycle once: Fix panic when module unload can: usb: esd_usb2: esd_usb2_rx_event(): fix the interchange of the CAN RX and TX error counters Revert "USB: serial: ch341: fix character loss at high transfer rates" USB: serial: option: add new VID/PID to support Fibocom FG150 usb: dwc3: gadget: Fix dwc3_calc_trbs_left() usb: dwc3: gadget: Stop EP0 transfers during pullup disable IB/hfi1: Fix possible null-pointer dereference in _extend_sdma_tx_descs() e1000e: Fix the max snoop/no-snoop latency for 10M ip_gre: add validation for csum_start xgene-v2: Fix a resource leak in the error handling path of 'xge_probe()' net: marvell: fix MVNETA_TX_IN_PRGRS bit number net: hns3: fix get wrong pfc_en when query PFC configuration usb: gadget: u_audio: fix race condition on endpoint stop opp: remove WARN when no valid OPPs remain virtio: Improve vq->broken access to avoid any compiler optimization virtio_pci: Support surprise removal of virtio pci device vringh: Use wiov->used to check for read/write desc order qed: qed ll2 race condition fixes qed: Fix null-pointer dereference in qed_rdma_create_qp() drm: Copy drm_wait_vblank to user before returning drm/nouveau/disp: power down unused DP links during init net/rds: dma_map_sg is entitled to merge entries vt_kdsetmode: extend console locking fbmem: add margin check to fb_check_caps() KVM: x86/mmu: Treat NX as used (not reserved) for all !TDP shadow MMUs Revert "floppy: reintroduce O_NDELAY fix" net: don't unconditionally copy_from_user a struct ifreq for socket ioctls Linux 4.19.206 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I04e05680c5e311bc4cd79daae49d654b66f774a0 |
|||
| 8313432df2 |
bpf: Fix 32 bit src register truncation on div/mod
Commit e88b2c6e5a4d9ce30d75391e4d950da74bb2bd90 upstream.
While reviewing a different fix, John and I noticed an oddity in one of the
BPF program dumps that stood out, for example:
# bpftool p d x i 13
0: (b7) r0 = 808464450
1: (b4) w4 = 808464432
2: (bc) w0 = w0
3: (15) if r0 == 0x0 goto pc+1
4: (9c) w4 %= w0
[...]
In line 2 we noticed that the mov32 would 32 bit truncate the original src
register for the div/mod operation. While for the two operations the dst
register is typically marked unknown e.g. from adjust_scalar_min_max_vals()
the src register is not, and thus verifier keeps tracking original bounds,
simplified:
0: R1=ctx(id=0,off=0,imm=0) R10=fp0
0: (b7) r0 = -1
1: R0_w=invP-1 R1=ctx(id=0,off=0,imm=0) R10=fp0
1: (b7) r1 = -1
2: R0_w=invP-1 R1_w=invP-1 R10=fp0
2: (3c) w0 /= w1
3: R0_w=invP(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R1_w=invP-1 R10=fp0
3: (77) r1 >>= 32
4: R0_w=invP(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R1_w=invP4294967295 R10=fp0
4: (bf) r0 = r1
5: R0_w=invP4294967295 R1_w=invP4294967295 R10=fp0
5: (95) exit
processed 6 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
Runtime result of r0 at exit is 0 instead of expected -1. Remove the
verifier mov32 src rewrite in div/mod and replace it with a jmp32 test
instead. After the fix, we result in the following code generation when
having dividend r1 and divisor r6:
div, 64 bit: div, 32 bit:
0: (b7) r6 = 8 0: (b7) r6 = 8
1: (b7) r1 = 8 1: (b7) r1 = 8
2: (55) if r6 != 0x0 goto pc+2 2: (56) if w6 != 0x0 goto pc+2
3: (ac) w1 ^= w1 3: (ac) w1 ^= w1
4: (05) goto pc+1 4: (05) goto pc+1
5: (3f) r1 /= r6 5: (3c) w1 /= w6
6: (b7) r0 = 0 6: (b7) r0 = 0
7: (95) exit 7: (95) exit
mod, 64 bit: mod, 32 bit:
0: (b7) r6 = 8 0: (b7) r6 = 8
1: (b7) r1 = 8 1: (b7) r1 = 8
2: (15) if r6 == 0x0 goto pc+1 2: (16) if w6 == 0x0 goto pc+1
3: (9f) r1 %= r6 3: (9c) w1 %= w6
4: (b7) r0 = 0 4: (b7) r0 = 0
5: (95) exit 5: (95) exit
x86 in particular can throw a 'divide error' exception for div
instruction not only for divisor being zero, but also for the case
when the quotient is too large for the designated register. For the
edx:eax and rdx:rax dividend pair it is not an issue in x86 BPF JIT
since we always zero edx (rdx). Hence really the only protection
needed is against divisor being zero.
Fixes:
|
|||
| 6455a150fa |
Merge 4.19.178 into android-4.19-stable
Changes in 4.19.178
HID: make arrays usage and value to be the same
USB: quirks: sort quirk entries
usb: quirks: add quirk to start video capture on ELMO L-12F document camera reliable
ntfs: check for valid standard information attribute
arm64: tegra: Add power-domain for Tegra210 HDA
scripts: use pkg-config to locate libcrypto
scripts: set proper OpenSSL include dir also for sign-file
block: add helper for checking if queue is registered
block: split .sysfs_lock into two locks
block: fix race between switching elevator and removing queues
block: don't release queue's sysfs lock during switching elevator
NET: usb: qmi_wwan: Adding support for Cinterion MV31
cifs: Set CIFS_MOUNT_USE_PREFIX_PATH flag on setting cifs_sb->prepath.
scripts/recordmcount.pl: support big endian for ARCH sh
jump_label/lockdep: Assert we hold the hotplug lock for _cpuslocked() operations
locking/static_key: Fix false positive warnings on concurrent dec/inc
vmlinux.lds.h: add DWARF v5 sections
kdb: Make memory allocations more robust
PCI: qcom: Use PHY_REFCLK_USE_PAD only for ipq8064
bfq: Avoid false bfq queue merging
ALSA: usb-audio: Fix PCM buffer allocation in non-vmalloc mode
MIPS: vmlinux.lds.S: add missing PAGE_ALIGNED_DATA() section
random: fix the RNDRESEEDCRNG ioctl
ath10k: Fix error handling in case of CE pipe init failure
Bluetooth: btqcomsmd: Fix a resource leak in error handling paths in the probe function
Bluetooth: Fix initializing response id after clearing struct
ARM: dts: exynos: correct PMIC interrupt trigger level on Artik 5
ARM: dts: exynos: correct PMIC interrupt trigger level on Monk
ARM: dts: exynos: correct PMIC interrupt trigger level on Rinato
ARM: dts: exynos: correct PMIC interrupt trigger level on Spring
ARM: dts: exynos: correct PMIC interrupt trigger level on Arndale Octa
ARM: dts: exynos: correct PMIC interrupt trigger level on Odroid XU3 family
arm64: dts: exynos: correct PMIC interrupt trigger level on TM2
arm64: dts: exynos: correct PMIC interrupt trigger level on Espresso
bpf: Avoid warning when re-casting __bpf_call_base into __bpf_call_base_args
arm64: dts: allwinner: A64: properly connect USB PHY to port 0
arm64: dts: allwinner: Drop non-removable from SoPine/LTS SD card
arm64: dts: allwinner: A64: Limit MMC2 bus frequency to 150 MHz
cpufreq: brcmstb-avs-cpufreq: Free resources in error path
cpufreq: brcmstb-avs-cpufreq: Fix resource leaks in ->remove()
ACPICA: Fix exception code class checks
usb: gadget: u_audio: Free requests only after callback
Bluetooth: drop HCI device reference before return
Bluetooth: Put HCI device if inquiry procedure interrupts
memory: ti-aemif: Drop child node when jumping out loop
ARM: dts: Configure missing thermal interrupt for 4430
usb: dwc2: Do not update data length if it is 0 on inbound transfers
usb: dwc2: Abort transaction after errors with unknown reason
usb: dwc2: Make "trimming xfer length" a debug message
staging: rtl8723bs: wifi_regd.c: Fix incorrect number of regulatory rules
ARM: dts: armada388-helios4: assign pinctrl to LEDs
ARM: dts: armada388-helios4: assign pinctrl to each fan
arm64: dts: msm8916: Fix reserved and rfsa nodes unit address
ARM: s3c: fix fiq for clang IAS
soc: aspeed: snoop: Add clock control logic
bpf_lru_list: Read double-checked variable once without lock
ath9k: fix data bus crash when setting nf_override via debugfs
ibmvnic: Set to CLOSED state even on error
bnxt_en: reverse order of TX disable and carrier off
xen/netback: fix spurious event detection for common event case
mac80211: fix potential overflow when multiplying to u32 integers
bpf: Fix bpf_fib_lookup helper MTU check for SKB ctx
tcp: fix SO_RCVLOWAT related hangs under mem pressure
cxgb4/chtls/cxgbit: Keeping the max ofld immediate data size same in cxgb4 and ulds
b43: N-PHY: Fix the update of coef for the PHY revision >= 3case
ibmvnic: add memory barrier to protect long term buffer
ibmvnic: skip send_request_unmap for timeout reset
net: amd-xgbe: Reset the PHY rx data path when mailbox command timeout
net: amd-xgbe: Fix NETDEV WATCHDOG transmit queue timeout warning
net: amd-xgbe: Reset link when the link never comes back
net: amd-xgbe: Fix network fluctuations when using 1G BELFUSE SFP
net: mvneta: Remove per-cpu queue mapping for Armada 3700
fbdev: aty: SPARC64 requires FB_ATY_CT
drm/gma500: Fix error return code in psb_driver_load()
gma500: clean up error handling in init
crypto: sun4i-ss - fix kmap usage
drm/amdgpu: Fix macro name _AMDGPU_TRACE_H_ in preprocessor if condition
MIPS: c-r4k: Fix section mismatch for loongson2_sc_init
MIPS: lantiq: Explicitly compare LTQ_EBU_PCC_ISTAT against 0
media: i2c: ov5670: Fix PIXEL_RATE minimum value
media: camss: missing error code in msm_video_register()
media: vsp1: Fix an error handling path in the probe function
media: em28xx: Fix use-after-free in em28xx_alloc_urbs
media: media/pci: Fix memleak in empress_init
media: tm6000: Fix memleak in tm6000_start_stream
ASoC: cs42l56: fix up error handling in probe
crypto: bcm - Rename struct device_private to bcm_device_private
drm/amd/display: Fix 10/12 bpc setup in DCE output bit depth reduction.
media: lmedm04: Fix misuse of comma
media: qm1d1c0042: fix error return code in qm1d1c0042_init()
media: cx25821: Fix a bug when reallocating some dma memory
media: pxa_camera: declare variable when DEBUG is defined
media: uvcvideo: Accept invalid bFormatIndex and bFrameIndex values
crypto: talitos - Work around SEC6 ERRATA (AES-CTR mode data size error)
ata: ahci_brcm: Add back regulators management
ASoC: cpcap: fix microphone timeslot mask
f2fs: fix to avoid inconsistent quota data
drm/amdgpu: Prevent shift wrapping in amdgpu_read_mask()
Drivers: hv: vmbus: Avoid use-after-free in vmbus_onoffer_rescind()
btrfs: clarify error returns values in __load_free_space_cache
hwrng: timeriomem - Fix cooldown period calculation
crypto: ecdh_helper - Ensure 'len >= secret.len' in decode_key()
ima: Free IMA measurement buffer on error
ima: Free IMA measurement buffer after kexec syscall
fs/jfs: fix potential integer overflow on shift of a int
jffs2: fix use after free in jffs2_sum_write_data()
capabilities: Don't allow writing ambiguous v3 file capabilities
clk: meson: clk-pll: fix initializing the old rate (fallback) for a PLL
quota: Fix memory leak when handling corrupted quota file
spi: cadence-quadspi: Abort read if dummy cycles required are too many
clk: sunxi-ng: h6: Fix CEC clock
HID: core: detect and skip invalid inputs to snto32()
dmaengine: fsldma: Fix a resource leak in the remove function
dmaengine: fsldma: Fix a resource leak in an error handling path of the probe function
dmaengine: owl-dma: Fix a resource leak in the remove function
dmaengine: hsu: disable spurious interrupt
mfd: bd9571mwv: Use devm_mfd_add_devices()
fdt: Properly handle "no-map" field in the memory region
of/fdt: Make sure no-map does not remove already reserved regions
power: reset: at91-sama5d2_shdwc: fix wkupdbc mask
rtc: s5m: select REGMAP_I2C
clocksource/drivers/mxs_timer: Add missing semicolon when DEBUG is defined
RDMA/mlx5: Use the correct obj_id upon DEVX TIR creation
clk: sunxi-ng: h6: Fix clock divider range on some clocks
regulator: axp20x: Fix reference cout leak
certs: Fix blacklist flag type confusion
spi: atmel: Put allocated master before return
regulator: s5m8767: Drop regulators OF node reference
isofs: release buffer head before return
auxdisplay: ht16k33: Fix refresh rate handling
IB/umad: Return EIO in case of when device disassociated
IB/umad: Return EPOLLERR in case of when device disassociated
KVM: PPC: Make the VMX instruction emulation routines static
powerpc/47x: Disable 256k page size
mmc: usdhi6rol0: Fix a resource leak in the error handling path of the probe
mmc: renesas_sdhi_internal_dmac: Fix DMA buffer alignment from 8 to 128-bytes
ARM: 9046/1: decompressor: Do not clear SCTLR.nTLSMD for ARMv7+ cores
amba: Fix resource leak for drivers without .remove
tracepoint: Do not fail unregistering a probe due to memory failure
perf tools: Fix DSO filtering when not finding a map for a sampled address
RDMA/rxe: Fix coding error in rxe_recv.c
RDMA/rxe: Correct skb on loopback path
spi: stm32: properly handle 0 byte transfer
mfd: wm831x-auxadc: Prevent use after free in wm831x_auxadc_read_irq()
powerpc/pseries/dlpar: handle ibm, configure-connector delay status
powerpc/8xx: Fix software emulation interrupt
clk: qcom: gcc-msm8998: Fix Alpha PLL type for all GPLLs
spi: pxa2xx: Fix the controller numbering for Wildcat Point
Input: sur40 - fix an error code in sur40_probe()
perf intel-pt: Fix missing CYC processing in PSB
perf test: Fix unaligned access in sample parsing test
Input: elo - fix an error code in elo_connect()
sparc64: only select COMPAT_BINFMT_ELF if BINFMT_ELF is set
misc: eeprom_93xx46: Fix module alias to enable module autoprobe
misc: eeprom_93xx46: Add module alias to avoid breaking support for non device tree users
pwm: rockchip: rockchip_pwm_probe(): Remove superfluous clk_unprepare()
VMCI: Use set_page_dirty_lock() when unregistering guest memory
PCI: Align checking of syscall user config accessors
drm/msm/dsi: Correct io_start for MSM8994 (20nm PHY)
ext4: fix potential htree index checksum corruption
regmap: sdw: use _no_pm functions in regmap_read/write
i40e: Fix flow for IPv6 next header (extension header)
i40e: Add zero-initialization of AQ command structures
i40e: Fix overwriting flow control settings during driver loading
i40e: Fix VFs not created
i40e: Fix add TC filter for IPv6
net/mlx4_core: Add missed mlx4_free_cmd_mailbox()
vxlan: move debug check after netdev unregister
ocfs2: fix a use after free on error
mm/memory.c: fix potential pte_unmap_unlock pte error
mm/hugetlb: fix potential double free in hugetlb_register_node() error path
r8169: fix jumbo packet handling on RTL8168e
arm64: Add missing ISB after invalidating TLB in __primary_switch
i2c: brcmstb: Fix brcmstd_send_i2c_cmd condition
mm/rmap: fix potential pte_unmap on an not mapped pte
scsi: bnx2fc: Fix Kconfig warning & CNIC build errors
blk-settings: align max_sectors on "logical_block_size" boundary
ACPI: property: Fix fwnode string properties matching
ACPI: configfs: add missing check after configfs_register_default_group()
HID: wacom: Ignore attempts to overwrite the touch_max value from HID
Input: raydium_ts_i2c - do not send zero length
Input: xpad - add support for PowerA Enhanced Wired Controller for Xbox Series X|S
Input: joydev - prevent potential read overflow in ioctl
Input: i8042 - add ASUS Zenbook Flip to noselftest list
USB: serial: option: update interface mapping for ZTE P685M
usb: musb: Fix runtime PM race in musb_queue_resume_work
usb: dwc3: gadget: Fix setting of DEPCFG.bInterval_m1
usb: dwc3: gadget: Fix dep->interval for fullspeed interrupt
USB: serial: ftdi_sio: fix FTX sub-integer prescaler
USB: serial: mos7840: fix error code in mos7840_write()
USB: serial: mos7720: fix error code in mos7720_write()
ALSA: hda/realtek: modify EAPD in the ALC886
tpm_tis: Fix check_locality for correct locality acquisition
tpm_tis: Clean up locality release
KEYS: trusted: Fix migratable=1 failing
btrfs: abort the transaction if we fail to inc ref in btrfs_copy_root
btrfs: fix reloc root leak with 0 ref reloc roots on recovery
btrfs: fix extent buffer leak on failure to copy root
crypto: arm64/sha - add missing module aliases
crypto: sun4i-ss - checking sg length is not sufficient
crypto: sun4i-ss - handle BigEndian for cipher
seccomp: Add missing return in non-void function
misc: rtsx: init of rts522a add OCP power off when no card is present
drivers/misc/vmw_vmci: restrict too big queue size in qp_host_alloc_queue
pstore: Fix typo in compression option name
dts64: mt7622: fix slow sd card access
staging/mt7621-dma: mtk-hsdma.c->hsdma-mt7621.c
staging: gdm724x: Fix DMA from stack
staging: rtl8188eu: Add Edimax EW-7811UN V2 to device table
media: ipu3-cio2: Fix mbus_code processing in cio2_subdev_set_fmt()
x86/reboot: Force all cpus to exit VMX root if VMX is supported
floppy: reintroduce O_NDELAY fix
arm64: uprobe: Return EOPNOTSUPP for AARCH32 instruction probing
watchdog: mei_wdt: request stop on unregister
mtd: spi-nor: hisi-sfc: Put child node np on error path
fs/affs: release old buffer head on error path
seq_file: document how per-entry resources are managed.
x86: fix seq_file iteration for pat/memtype.c
hugetlb: fix copy_huge_page_from_user contig page struct assumption
libnvdimm/dimm: Avoid race between probe and available_slots_show()
arm64: Extend workaround for erratum 1024718 to all versions of Cortex-A55
module: Ignore _GLOBAL_OFFSET_TABLE_ when warning for undefined symbols
mmc: sdhci-esdhc-imx: fix kernel panic when remove module
gpio: pcf857x: Fix missing first interrupt
printk: fix deadlock when kernel panic
cpufreq: intel_pstate: Get per-CPU max freq via MSR_HWP_CAPABILITIES if available
f2fs: fix out-of-repair __setattr_copy()
sparc32: fix a user-triggerable oops in clear_user()
gfs2: Don't skip dlm unlock if glock has an lvb
dm: fix deadlock when swapping to encrypted device
dm era: Recover committed writeset after crash
dm era: Verify the data block size hasn't changed
dm era: Fix bitset memory leaks
dm era: Use correct value size in equality function of writeset tree
dm era: Reinitialize bitset cache before digesting a new writeset
dm era: only resize metadata in preresume
icmp: introduce helper for nat'd source address in network device context
icmp: allow icmpv6_ndo_send to work with CONFIG_IPV6=n
gtp: use icmp_ndo_send helper
sunvnet: use icmp_ndo_send helper
xfrm: interface: use icmp_ndo_send helper
ipv6: icmp6: avoid indirect call for icmpv6_send()
ipv6: silence compilation warning for non-IPV6 builds
net: icmp: pass zeroed opts from icmp{,v6}_ndo_send before sending
dm era: Update in-core bitset after committing the metadata
net: qrtr: Fix memory leak in qrtr_tun_open
ARM: dts: aspeed: Add LCLK to lpc-snoop
Linux 4.19.178
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I8c07c10dd29a1233f238b533622d7b32bd22bdb0
|
|||
| 6d7fdbb979 |
bpf: Avoid warning when re-casting __bpf_call_base into __bpf_call_base_args
[ Upstream commit 6943c2b05bf09fd5c5729f7d7d803bf3f126cb9a ]
BPF interpreter uses extra input argument, so re-casts __bpf_call_base into
__bpf_call_base_args. Avoid compiler warning about incompatible function
prototypes by casting to void * first.
Fixes:
|
|||
| 588b1f9c92 |
Merge 4.19.133 into android-4.19-stable
Changes in 4.19.133 KVM: s390: reduce number of IO pins to 1 spi: spi-fsl-dspi: Adding shutdown hook spi: spi-fsl-dspi: Fix lockup if device is removed during SPI transfer spi: spi-fsl-dspi: use IRQF_SHARED mode to request IRQ spi: spi-fsl-dspi: Fix external abort on interrupt in resume or exit paths regmap: fix alignment issue ARM: dts: omap4-droid4: Fix spi configuration and increase rate drm/tegra: hub: Do not enable orphaned window group gpu: host1x: Detach driver on unregister spi: spidev: fix a race between spidev_release and spidev_remove spi: spidev: fix a potential use-after-free in spidev_release() ixgbe: protect ring accesses with READ- and WRITE_ONCE i40e: protect ring accesses with READ- and WRITE_ONCE drm: panel-orientation-quirks: Add quirk for Asus T101HA panel drm: panel-orientation-quirks: Use generic orientation-data for Acer S1003 s390/kasan: fix early pgm check handler execution cifs: update ctime and mtime during truncate ARM: imx6: add missing put_device() call in imx6q_suspend_init() scsi: mptscsih: Fix read sense data size usb: dwc3: pci: Fix reference count leak in dwc3_pci_resume_work block: release bip in a right way in error path nvme-rdma: assign completion vector correctly x86/entry: Increase entry_stack size to a full page net: qrtr: Fix an out of bounds read qrtr_endpoint_post() drm/mediatek: Check plane visibility in atomic_update net: cxgb4: fix return error value in t4_prep_fw smsc95xx: check return value of smsc95xx_reset smsc95xx: avoid memory leak in smsc95xx_bind net: hns3: fix use-after-free when doing self test ALSA: compress: fix partial_drain completion state arm64: kgdb: Fix single-step exception handling oops nbd: Fix memory leak in nbd_add_socket cxgb4: fix all-mask IP address comparison bnxt_en: fix NULL dereference in case SR-IOV configuration fails net: macb: mark device wake capable when "magic-packet" property present mlxsw: spectrum_router: Remove inappropriate usage of WARN_ON() ALSA: opl3: fix infoleak in opl3 ALSA: hda - let hs_mic be picked ahead of hp_mic ALSA: usb-audio: add quirk for MacroSilicon MS2109 KVM: arm64: Fix definition of PAGE_HYP_DEVICE KVM: arm64: Stop clobbering x0 for HVC_SOFT_RESTART KVM: x86: bit 8 of non-leaf PDPEs is not reserved KVM: x86: Inject #GP if guest attempts to toggle CR4.LA57 in 64-bit mode KVM: x86: Mark CR4.TSD as being possibly owned by the guest kallsyms: Refactor kallsyms_show_value() to take cred kernel: module: Use struct_size() helper module: Refactor section attr into bin attribute module: Do not expose section addresses to non-CAP_SYSLOG kprobes: Do not expose probe addresses to non-CAP_SYSLOG bpf: Check correct cred for CAP_SYSLOG in bpf_dump_raw_ok() Revert "ath9k: Fix general protection fault in ath9k_hif_usb_rx_cb" btrfs: fix fatal extent_buffer readahead vs releasepage race drm/radeon: fix double free dm: use noio when sending kobject event ARC: entry: fix potential EFA clobber when TIF_SYSCALL_TRACE ARC: elf: use right ELF_ARCH s390/mm: fix huge pte soft dirty copying Linux 4.19.133 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I0a0198d501017d2bc701b653d75dc9cedd1ebbd9 |
|||
| b80e052c3a |
bpf: Check correct cred for CAP_SYSLOG in bpf_dump_raw_ok()
commit 63960260457a02af2a6cb35d75e6bdb17299c882 upstream.
When evaluating access control over kallsyms visibility, credentials at
open() time need to be used, not the "current" creds (though in BPF's
case, this has likely always been the same). Plumb access to associated
file->f_cred down through bpf_dump_raw_ok() and its callers now that
kallsysm_show_value() has been refactored to take struct cred.
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: bpf@vger.kernel.org
Cc: stable@vger.kernel.org
Fixes:
|
|||
| ebce5c1fb5 |
kallsyms: Refactor kallsyms_show_value() to take cred
commit 160251842cd35a75edfb0a1d76afa3eb674ff40a upstream. In order to perform future tests against the cred saved during open(), switch kallsyms_show_value() to operate on a cred, and have all current callers pass current_cred(). This makes it very obvious where callers are checking the wrong credential in their "read" contexts. These will be fixed in the coming patches. Additionally switch return value to bool, since it is always used as a direct permission check, not a 0-on-success, negative-on-error style function return. Cc: stable@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
|||
| 9a11e8da57 |
ANDROID: bpf: validate bpf_func when BPF_JIT is enabled with CFI
With CONFIG_BPF_JIT, the kernel makes indirect calls to dynamically generated code, which the compile-time Control-Flow Integrity (CFI) checking cannot validate. This change adds basic sanity checking to ensure we are jumping to a valid location, which narrows down the attack surface on the stored pointer. In addition, this change adds a weak arch_bpf_jit_check_func function, which architectures that implement BPF JIT can override to perform additional validation, such as verifying that the pointer points to the correct memory region. Bug: 140377409 Change-Id: I8ebac6637ab6bd9db44716b1c742add267298669 Signed-off-by: Sami Tolvanen <samitolvanen@google.com> |
|||
| 54e8cf41b2 |
bpf: fix bpf_jit_limit knob for PAGE_SIZE >= 64K
[ Upstream commit fdadd04931c2d7cd294dc5b2b342863f94be53a3 ]
Michael and Sandipan report:
Commit ede95a63b5 introduced a bpf_jit_limit tuneable to limit BPF
JIT allocations. At compile time it defaults to PAGE_SIZE * 40000,
and is adjusted again at init time if MODULES_VADDR is defined.
For ppc64 kernels, MODULES_VADDR isn't defined, so we're stuck with
the compile-time default at boot-time, which is 0x9c400000 when
using 64K page size. This overflows the signed 32-bit bpf_jit_limit
value:
root@ubuntu:/tmp# cat /proc/sys/net/core/bpf_jit_limit
-1673527296
and can cause various unexpected failures throughout the network
stack. In one case `strace dhclient eth0` reported:
setsockopt(5, SOL_SOCKET, SO_ATTACH_FILTER, {len=11, filter=0x105dd27f8},
16) = -1 ENOTSUPP (Unknown error 524)
and similar failures can be seen with tools like tcpdump. This doesn't
always reproduce however, and I'm not sure why. The more consistent
failure I've seen is an Ubuntu 18.04 KVM guest booted on a POWER9
host would time out on systemd/netplan configuring a virtio-net NIC
with no noticeable errors in the logs.
Given this and also given that in near future some architectures like
arm64 will have a custom area for BPF JIT image allocations we should
get rid of the BPF_JIT_LIMIT_DEFAULT fallback / default entirely. For
4.21, we have an overridable bpf_jit_alloc_exec(), bpf_jit_free_exec()
so therefore add another overridable bpf_jit_alloc_exec_limit() helper
function which returns the possible size of the memory area for deriving
the default heuristic in bpf_jit_charge_init().
Like bpf_jit_alloc_exec() and bpf_jit_free_exec(), the new
bpf_jit_alloc_exec_limit() assumes that module_alloc() is the default
JIT memory provider, and therefore in case archs implement their custom
module_alloc() we use MODULES_{END,_VADDR} for limits and otherwise for
vmalloc_exec() cases like on ppc64 we use VMALLOC_{END,_START}.
Additionally, for archs supporting large page sizes, we should change
the sysctl to be handled as long to not run into sysctl restrictions
in future.
Fixes: ede95a63b5e8 ("bpf: add bpf_jit_limit knob to restrict unpriv allocations")
Reported-by: Sandipan Das <sandipan@linux.ibm.com>
Reported-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|||
| 8715ce033e |
x86/modules: Avoid breaking W^X while loading modules
[ Upstream commit f2c65fb3221adc6b73b0549fc7ba892022db9797 ] When modules and BPF filters are loaded, there is a time window in which some memory is both writable and executable. An attacker that has already found another vulnerability (e.g., a dangling pointer) might be able to exploit this behavior to overwrite kernel code. Prevent having writable executable PTEs in this stage. In addition, avoiding having W+X mappings can also slightly simplify the patching of modules code on initialization (e.g., by alternatives and static-key), as would be done in the next patch. This was actually the main motivation for this patch. To avoid having W+X mappings, set them initially as RW (NX) and after they are set as RO set them as X as well. Setting them as executable is done as a separate step to avoid one core in which the old PTE is cached (hence writable), and another which sees the updated PTE (executable), which would break the W^X protection. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Suggested-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Nadav Amit <namit@vmware.com> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <akpm@linux-foundation.org> Cc: <ard.biesheuvel@linaro.org> Cc: <deneen.t.dock@intel.com> Cc: <kernel-hardening@lists.openwall.com> Cc: <kristen@linux.intel.com> Cc: <linux_dti@icloud.com> Cc: <will.deacon@arm.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jessica Yu <jeyu@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Rik van Riel <riel@surriel.com> Link: https://lkml.kernel.org/r/20190426001143.4983-12-namit@vmware.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> |
|||
| 43caa29c99 |
bpf: add bpf_jit_limit knob to restrict unpriv allocations
commit ede95a63b5e84ddeea6b0c473b36ab8bfd8c6ce3 upstream. Rick reported that the BPF JIT could potentially fill the entire module space with BPF programs from unprivileged users which would prevent later attempts to load normal kernel modules or privileged BPF programs, for example. If JIT was enabled but unsuccessful to generate the image, then before commit |
|||
| ae92cf4760 |
bpf: fix missing prototype warnings
[ Upstream commit 116bfa96a255123ed209da6544f74a4f2eaca5da ]
Compiling with W=1 generates warnings:
CC kernel/bpf/core.o
kernel/bpf/core.c:721:12: warning: no previous prototype for ?bpf_jit_alloc_exec_limit? [-Wmissing-prototypes]
721 | u64 __weak bpf_jit_alloc_exec_limit(void)
| ^~~~~~~~~~~~~~~~~~~~~~~~
kernel/bpf/core.c:757:14: warning: no previous prototype for ?bpf_jit_alloc_exec? [-Wmissing-prototypes]
757 | void *__weak bpf_jit_alloc_exec(unsigned long size)
| ^~~~~~~~~~~~~~~~~~
kernel/bpf/core.c:762:13: warning: no previous prototype for ?bpf_jit_free_exec? [-Wmissing-prototypes]
762 | void __weak bpf_jit_free_exec(void *addr)
| ^~~~~~~~~~~~~~~~~
All three are weak functions that archs can override, provide
proper prototypes for when a new arch provides their own.
Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|||
| 232ac70dd3 |
bpf: enable access to ax register also from verifier rewrite
[ commit 9b73bfdd08e73231d6a90ae6db4b46b3fbf56c30 upstream ] Right now we are using BPF ax register in JIT for constant blinding as well as in interpreter as temporary variable. Verifier will not be able to use it simply because its use will get overridden from the former in bpf_jit_blind_insn(). However, it can be made to work in that blinding will be skipped if there is prior use in either source or destination register on the instruction. Taking constraints of ax into account, the verifier is then open to use it in rewrites under some constraints. Note, ax register already has mappings in every eBPF JIT. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Sasha Levin <sashal@kernel.org> |
|||
| b855e31037 |
bpf: move tmp variable into ax register in interpreter
[ commit 144cd91c4c2bced6eb8a7e25e590f6618a11e854 upstream ] This change moves the on-stack 64 bit tmp variable in ___bpf_prog_run() into the hidden ax register. The latter is currently only used in JITs for constant blinding as a temporary scratch register, meaning the BPF interpreter will never see the use of ax. Therefore it is safe to use it for the cases where tmp has been used earlier. This is needed to later on allow restricted hidden use of ax in both interpreter and JITs. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Sasha Levin <sashal@kernel.org> |
|||
| 464b01e440 |
bpf: Allow narrow loads with offset > 0
[ Upstream commit 46f53a65d2de3e1591636c22b626b09d8684fd71 ]
Currently BPF verifier allows narrow loads for a context field only with
offset zero. E.g. if there is a __u32 field then only the following
loads are permitted:
* off=0, size=1 (narrow);
* off=0, size=2 (narrow);
* off=0, size=4 (full).
On the other hand LLVM can generate a load with offset different than
zero that make sense from program logic point of view, but verifier
doesn't accept it.
E.g. tools/testing/selftests/bpf/sendmsg4_prog.c has code:
#define DST_IP4 0xC0A801FEU /* 192.168.1.254 */
...
if ((ctx->user_ip4 >> 24) == (bpf_htonl(DST_IP4) >> 24) &&
where ctx is struct bpf_sock_addr.
Some versions of LLVM can produce the following byte code for it:
8: 71 12 07 00 00 00 00 00 r2 = *(u8 *)(r1 + 7)
9: 67 02 00 00 18 00 00 00 r2 <<= 24
10: 18 03 00 00 00 00 00 fe 00 00 00 00 00 00 00 00 r3 = 4261412864 ll
12: 5d 32 07 00 00 00 00 00 if r2 != r3 goto +7 <LBB0_6>
where `*(u8 *)(r1 + 7)` means narrow load for ctx->user_ip4 with size=1
and offset=3 (7 - sizeof(ctx->user_family) = 3). This load is currently
rejected by verifier.
Verifier code that rejects such loads is in bpf_ctx_narrow_access_ok()
what means any is_valid_access implementation, that uses the function,
works this way, e.g. bpf_skb_is_valid_access() for __sk_buff or
sock_addr_is_valid_access() for bpf_sock_addr.
The patch makes such loads supported. Offset can be in [0; size_default)
but has to be multiple of load size. E.g. for __u32 field the following
loads are supported now:
* off=0, size=1 (narrow);
* off=1, size=1 (narrow);
* off=2, size=1 (narrow);
* off=3, size=1 (narrow);
* off=0, size=2 (narrow);
* off=2, size=2 (narrow);
* off=0, size=4 (full).
Reported-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|||
| f6069b9aa9 |
bpf: fix redirect to map under tail calls
Commits |
|||
| 8217ca653e |
bpf: Enable BPF_PROG_TYPE_SK_REUSEPORT bpf prog in reuseport selection
This patch allows a BPF_PROG_TYPE_SK_REUSEPORT bpf prog to select a SO_REUSEPORT sk from a BPF_MAP_TYPE_REUSEPORT_ARRAY introduced in the earlier patch. "bpf_run_sk_reuseport()" will return -ECONNREFUSED when the BPF_PROG_TYPE_SK_REUSEPORT prog returns SK_DROP. The callers, in inet[6]_hashtable.c and ipv[46]/udp.c, are modified to handle this case and return NULL immediately instead of continuing the sk search from its hashtable. It re-uses the existing SO_ATTACH_REUSEPORT_EBPF setsockopt to attach BPF_PROG_TYPE_SK_REUSEPORT. The "sk_reuseport_attach_bpf()" will check if the attaching bpf prog is in the new SK_REUSEPORT or the existing SOCKET_FILTER type and then check different things accordingly. One level of "__reuseport_attach_prog()" call is removed. The "sk_unhashed() && ..." and "sk->sk_reuseport_cb" tests are pushed back to "reuseport_attach_prog()" in sock_reuseport.c. sock_reuseport.c seems to have more knowledge on those test requirements than filter.c. In "reuseport_attach_prog()", after new_prog is attached to reuse->prog, the old_prog (if any) is also directly freed instead of returning the old_prog to the caller and asking the caller to free. The sysctl_optmem_max check is moved back to the "sk_reuseport_attach_filter()" and "sk_reuseport_attach_bpf()". As of other bpf prog types, the new BPF_PROG_TYPE_SK_REUSEPORT is only bounded by the usual "bpf_prog_charge_memlock()" during load time instead of bounded by both bpf_prog_charge_memlock and sysctl_optmem_max. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> |
|||
| 2dbb9b9e6d |
bpf: Introduce BPF_PROG_TYPE_SK_REUSEPORT
This patch adds a BPF_PROG_TYPE_SK_REUSEPORT which can select a SO_REUSEPORT sk from a BPF_MAP_TYPE_REUSEPORT_ARRAY. Like other non SK_FILTER/CGROUP_SKB program, it requires CAP_SYS_ADMIN. BPF_PROG_TYPE_SK_REUSEPORT introduces "struct sk_reuseport_kern" to store the bpf context instead of using the skb->cb[48]. At the SO_REUSEPORT sk lookup time, it is in the middle of transiting from a lower layer (ipv4/ipv6) to a upper layer (udp/tcp). At this point, it is not always clear where the bpf context can be appended in the skb->cb[48] to avoid saving-and-restoring cb[]. Even putting aside the difference between ipv4-vs-ipv6 and udp-vs-tcp. It is not clear if the lower layer is only ipv4 and ipv6 in the future and will it not touch the cb[] again before transiting to the upper layer. For example, in udp_gro_receive(), it uses the 48 byte NAPI_GRO_CB instead of IP[6]CB and it may still modify the cb[] after calling the udp[46]_lib_lookup_skb(). Because of the above reason, if sk->cb is used for the bpf ctx, saving-and-restoring is needed and likely the whole 48 bytes cb[] has to be saved and restored. Instead of saving, setting and restoring the cb[], this patch opts to create a new "struct sk_reuseport_kern" and setting the needed values in there. The new BPF_PROG_TYPE_SK_REUSEPORT and "struct sk_reuseport_(kern|md)" will serve all ipv4/ipv6 + udp/tcp combinations. There is no protocol specific usage at this point and it is also inline with the current sock_reuseport.c implementation (i.e. no protocol specific requirement). In "struct sk_reuseport_md", this patch exposes data/data_end/len with semantic similar to other existing usages. Together with "bpf_skb_load_bytes()" and "bpf_skb_load_bytes_relative()", the bpf prog can peek anywhere in the skb. The "bind_inany" tells the bpf prog that the reuseport group is bind-ed to a local INANY address which cannot be learned from skb. The new "bind_inany" is added to "struct sock_reuseport" which will be used when running the new "BPF_PROG_TYPE_SK_REUSEPORT" bpf prog in order to avoid repeating the "bind INANY" test on "sk_v6_rcv_saddr/sk->sk_rcv_saddr" every time a bpf prog is run. It can only be properly initialized when a "sk->sk_reuseport" enabled sk is adding to a hashtable (i.e. during "reuseport_alloc()" and "reuseport_add_sock()"). The new "sk_select_reuseport()" is the main helper that the bpf prog will use to select a SO_REUSEPORT sk. It is the only function that can use the new BPF_MAP_TYPE_REUSEPORT_ARRAY. As mentioned in the earlier patch, the validity of a selected sk is checked in run time in "sk_select_reuseport()". Doing the check in verification time is difficult and inflexible (consider the map-in-map use case). The runtime check is to compare the selected sk's reuseport_id with the reuseport_id that we want. This helper will return -EXXX if the selected sk cannot serve the incoming request (e.g. reuseport_id not match). The bpf prog can decide if it wants to do SK_DROP as its discretion. When the bpf prog returns SK_PASS, the kernel will check if a valid sk has been selected (i.e. "reuse_kern->selected_sk != NULL"). If it does , it will use the selected sk. If not, the kernel will select one from "reuse->socks[]" (as before this patch). The SK_DROP and SK_PASS handling logic will be in the next patch. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> |
|||
| 2539650fad |
xdp: Helpers for disabling napi_direct of xdp_return_frame
We need some mechanism to disable napi_direct on calling xdp_return_frame_rx_napi() from some context. When veth gets support of XDP_REDIRECT, it will redirects packets which are redirected from other devices. On redirection veth will reuse xdp_mem_info of the redirection source device to make return_frame work. But in this case .ndo_xdp_xmit() called from veth redirection uses xdp_mem_info which is not guarded by NAPI, because the .ndo_xdp_xmit() is not called directly from the rxq which owns the xdp_mem_info. This approach introduces a flag in bpf_redirect_info to indicate that napi_direct should be disabled even when _rx_napi variant is used as well as helper functions to use it. A NAPI handler who wants to use this flag needs to call xdp_set_return_frame_no_direct() before processing packets, and call xdp_clear_return_frame_no_direct() after xdp_do_flush_map() before exiting NAPI. v4: - Use bpf_redirect_info for storing the flag instead of xdp_mem_info to avoid per-frame copy cost. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> |
|||
| 0b19cc0a86 |
bpf: Make redirect_info accessible from modules
We are going to add kern_flags field in redirect_info for kernel internal use. In order to avoid function call to access the flags, make redirect_info accessible from modules. Also as it is now non-static, add prefix bpf_ to redirect_info. v6: - Fix sparse warning around EXPORT_SYMBOL. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> |
|||
| d8d7218ad8 |
xdp: XDP_REDIRECT should check IFF_UP and MTU
Otherwise we end up with attempting to send packets from down devices
or to send oversized packets, which may cause unexpected driver/device
behaviour. Generic XDP has already done this check, so reuse the logic
in native XDP.
Fixes:
|
|||
| 85782e037f |
bpf: undo prog rejection on read-only lock failure
Partially undo commit |
|||
| 9262478220 |
bpf: enforce correct alignment for instructions
After commit |
|||
| 6d5fc19579 |
xdp: Fix handling of devmap in generic XDP
Commit |
|||
| 9facc33687 |
bpf: reject any prog that failed read-only lock
We currently lock any JITed image as read-only via bpf_jit_binary_lock_ro()
as well as the BPF image as read-only through bpf_prog_lock_ro(). In
the case any of these would fail we throw a WARN_ON_ONCE() in order to
yell loudly to the log. Perhaps, to some extend, this may be comparable
to an allocation where __GFP_NOWARN is explicitly not set.
Added via
|
|||
| 7d1982b4e3 |
bpf: fix panic in prog load calls cleanup
While testing I found that when hitting error path in bpf_prog_load()
where we jump to free_used_maps and prog contained BPF to BPF calls
that were JITed earlier, then we never clean up the bpf_prog_kallsyms_add()
done under jit_subprogs(). Add proper API to make BPF kallsyms deletion
more clear and fix that.
Fixes:
|
|||
| bc23105ca0 |
bpf: fix context access in tracing progs on 32 bit archs
Wang reported that all the testcases for BPF_PROG_TYPE_PERF_EVENT program type in test_verifier report the following errors on x86_32: 172/p unpriv: spill/fill of different pointers ldx FAIL Unexpected error message! 0: (bf) r6 = r10 1: (07) r6 += -8 2: (15) if r1 == 0x0 goto pc+3 R1=ctx(id=0,off=0,imm=0) R6=fp-8,call_-1 R10=fp0,call_-1 3: (bf) r2 = r10 4: (07) r2 += -76 5: (7b) *(u64 *)(r6 +0) = r2 6: (55) if r1 != 0x0 goto pc+1 R1=ctx(id=0,off=0,imm=0) R2=fp-76,call_-1 R6=fp-8,call_-1 R10=fp0,call_-1 fp-8=fp 7: (7b) *(u64 *)(r6 +0) = r1 8: (79) r1 = *(u64 *)(r6 +0) 9: (79) r1 = *(u64 *)(r1 +68) invalid bpf_context access off=68 size=8 378/p check bpf_perf_event_data->sample_period byte load permitted FAIL Failed to load prog 'Permission denied'! 0: (b7) r0 = 0 1: (71) r0 = *(u8 *)(r1 +68) invalid bpf_context access off=68 size=1 379/p check bpf_perf_event_data->sample_period half load permitted FAIL Failed to load prog 'Permission denied'! 0: (b7) r0 = 0 1: (69) r0 = *(u16 *)(r1 +68) invalid bpf_context access off=68 size=2 380/p check bpf_perf_event_data->sample_period word load permitted FAIL Failed to load prog 'Permission denied'! 0: (b7) r0 = 0 1: (61) r0 = *(u32 *)(r1 +68) invalid bpf_context access off=68 size=4 381/p check bpf_perf_event_data->sample_period dword load permitted FAIL Failed to load prog 'Permission denied'! 0: (b7) r0 = 0 1: (79) r0 = *(u64 *)(r1 +68) invalid bpf_context access off=68 size=8 Reason is that struct pt_regs on x86_32 doesn't fully align to 8 byte boundary due to its size of 68 bytes. Therefore, bpf_ctx_narrow_access_ok() will then bail out saying that off & (size_default - 1) which is 68 & 7 doesn't cleanly align in the case of sample_period access from struct bpf_perf_event_data, hence verifier wrongly thinks we might be doing an unaligned access here though underlying arch can handle it just fine. Therefore adjust this down to machine size and check and rewrite the offset for narrow access on that basis. We also need to fix corresponding pe_prog_is_valid_access(), since we hit the check for off % size != 0 (e.g. 68 % 8 -> 4) in the first and last test. With that in place, progs for tracing work on x86_32. Reported-by: Wang YanQing <udknight@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Tested-by: Wang YanQing <udknight@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> |
|||
| 09772d92cd |
bpf: avoid retpoline for lookup/update/delete calls on maps
While some of the BPF map lookup helpers provide a ->map_gen_lookup()
callback for inlining the map lookup altogether it is not available
for every map, so the remaining ones have to call bpf_map_lookup_elem()
helper which does a dispatch to map->ops->map_lookup_elem(). In
times of retpolines, this will control and trap speculative execution
rather than letting it do its work for the indirect call and will
therefore cause a slowdown. Likewise, bpf_map_update_elem() and
bpf_map_delete_elem() do not have an inlined version and need to call
into their map->ops->map_update_elem() resp. map->ops->map_delete_elem()
handlers.
Before:
# bpftool prog dump xlated id 1
0: (bf) r2 = r10
1: (07) r2 += -8
2: (7a) *(u64 *)(r2 +0) = 0
3: (18) r1 = map[id:1]
5: (85) call __htab_map_lookup_elem#232656
6: (15) if r0 == 0x0 goto pc+4
7: (71) r1 = *(u8 *)(r0 +35)
8: (55) if r1 != 0x0 goto pc+1
9: (72) *(u8 *)(r0 +35) = 1
10: (07) r0 += 56
11: (15) if r0 == 0x0 goto pc+4
12: (bf) r2 = r0
13: (18) r1 = map[id:1]
15: (85) call bpf_map_delete_elem#215008 <-- indirect call via
16: (95) exit helper
After:
# bpftool prog dump xlated id 1
0: (bf) r2 = r10
1: (07) r2 += -8
2: (7a) *(u64 *)(r2 +0) = 0
3: (18) r1 = map[id:1]
5: (85) call __htab_map_lookup_elem#233328
6: (15) if r0 == 0x0 goto pc+4
7: (71) r1 = *(u8 *)(r0 +35)
8: (55) if r1 != 0x0 goto pc+1
9: (72) *(u8 *)(r0 +35) = 1
10: (07) r0 += 56
11: (15) if r0 == 0x0 goto pc+4
12: (bf) r2 = r0
13: (18) r1 = map[id:1]
15: (85) call htab_lru_map_delete_elem#238240 <-- direct call
16: (95) exit
In all three lookup/update/delete cases however we can use the actual
address of the map callback directly if we find that there's only a
single path with a map pointer leading to the helper call, meaning
when the map pointer has not been poisoned from verifier side.
Example code can be seen above for the delete case.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|||
| 06be0864c7 |
bpf: test case for map pointer poison with calls/branches
Add several test cases where the same or different map pointers originate from different paths in the program and execute a map lookup or tail call at a common location. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> |
|||
| 1cedee13d2 |
bpf: Hooks for sys_sendmsg
In addition to already existing BPF hooks for sys_bind and sys_connect, the patch provides new hooks for sys_sendmsg. It leverages existing BPF program type `BPF_PROG_TYPE_CGROUP_SOCK_ADDR` that provides access to socket itlself (properties like family, type, protocol) and user-passed `struct sockaddr *` so that BPF program can override destination IP and port for system calls such as sendto(2) or sendmsg(2) and/or assign source IP to the socket. The hooks are implemented as two new attach types: `BPF_CGROUP_UDP4_SENDMSG` and `BPF_CGROUP_UDP6_SENDMSG` for UDPv4 and UDPv6 correspondingly. UDPv4 and UDPv6 separate attach types for same reason as sys_bind and sys_connect hooks, i.e. to prevent reading from / writing to e.g. user_ip6 fields when user passes sockaddr_in since it'd be out-of-bound. The difference with already existing hooks is sys_sendmsg are implemented only for unconnected UDP. For TCP it doesn't make sense to change user-provided `struct sockaddr *` at sendto(2)/sendmsg(2) time since socket either was already connected and has source/destination set or wasn't connected and call to sendto(2)/sendmsg(2) would lead to ENOTCONN anyway. Connected UDP is already handled by sys_connect hooks that can override source/destination at connect time and use fast-path later, i.e. these hooks don't affect UDP fast-path. Rewriting source IP is implemented differently than that in sys_connect hooks. When sys_sendmsg is used with unconnected UDP it doesn't work to just bind socket to desired local IP address since source IP can be set on per-packet basis by using ancillary data (cmsg(3)). So no matter if socket is bound or not, source IP has to be rewritten on every call to sys_sendmsg. To do so two new fields are added to UAPI `struct bpf_sock_addr`; * `msg_src_ip4` to set source IPv4 for UDPv4; * `msg_src_ip6` to set source IPv6 for UDPv6. Signed-off-by: Andrey Ignatov <rdna@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> |
|||
| 303def35f6 |
bpf: allow sk_msg programs to read sock fields
Currently sk_msg programs only have access to the raw data. However, it is often useful when building policies to have the policies specific to the socket endpoint. This allows using the socket tuple as input into filters, etc. This patch adds ctx access to the sock fields. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> |
|||
| e5cd3abcb3 |
bpf: sockmap, refactor sockmap routines to work with hashmap
This patch only refactors the existing sockmap code. This will allow
much of the psock initialization code path and bpf helper codes to
work for both sockmap bpf map types that are backed by an array, the
currently supported type, and the new hash backed bpf map type
sockhash.
Most the fallout comes from three changes,
- Pushing bpf programs into an independent structure so we
can use it from the htab struct in the next patch.
- Generalizing helpers to use void *key instead of the hardcoded
u32.
- Instead of passing map/key through the metadata we now do
the lookup inline. This avoids storing the key in the metadata
which will be useful when keys can be longer than 4 bytes. We
rename the sk pointers to sk_redir at this point as well to
avoid any confusion between the current sk pointer and the
redirect pointer sk_redir.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|||
| e0cea7ce98 |
bpf: implement ld_abs/ld_ind in native bpf
The main part of this work is to finally allow removal of LD_ABS and LD_IND from the BPF core by reimplementing them through native eBPF instead. Both LD_ABS/LD_IND were carried over from cBPF and keeping them around in native eBPF caused way more trouble than actually worth it. To just list some of the security issues in the past: * |
|||
| 02671e23e7 |
xsk: wire up XDP_SKB side of AF_XDP
This commit wires up the xskmap to XDP_SKB layer. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> |
|||
| c195651e56 |
bpf: add bpf_get_stack helper
Currently, stackmap and bpf_get_stackid helper are provided for bpf program to get the stack trace. This approach has a limitation though. If two stack traces have the same hash, only one will get stored in the stackmap table, so some stack traces are missing from user perspective. This patch implements a new helper, bpf_get_stack, will send stack traces directly to bpf program. The bpf program is able to see all stack traces, and then can do in-kernel processing or send stack traces to user space through shared map or bpf_perf_event_output. Acked-by: Alexei Starovoitov <ast@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> |
|||
| 106ca27f29 |
xdp: move struct xdp_buff from filter.h to xdp.h
This is done to prepare for the next patch, and it is also nice to move this XDP related struct out of filter.h. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
|||
| 4fbac77d2d |
bpf: Hooks for sys_bind
== The problem == There is a use-case when all processes inside a cgroup should use one single IP address on a host that has multiple IP configured. Those processes should use the IP for both ingress and egress, for TCP and UDP traffic. So TCP/UDP servers should be bound to that IP to accept incoming connections on it, and TCP/UDP clients should make outgoing connections from that IP. It should not require changing application code since it's often not possible. Currently it's solved by intercepting glibc wrappers around syscalls such as `bind(2)` and `connect(2)`. It's done by a shared library that is preloaded for every process in a cgroup so that whenever TCP/UDP server calls `bind(2)`, the library replaces IP in sockaddr before passing arguments to syscall. When application calls `connect(2)` the library transparently binds the local end of connection to that IP (`bind(2)` with `IP_BIND_ADDRESS_NO_PORT` to avoid performance penalty). Shared library approach is fragile though, e.g.: * some applications clear env vars (incl. `LD_PRELOAD`); * `/etc/ld.so.preload` doesn't help since some applications are linked with option `-z nodefaultlib`; * other applications don't use glibc and there is nothing to intercept. == The solution == The patch provides much more reliable in-kernel solution for the 1st part of the problem: binding TCP/UDP servers on desired IP. It does not depend on application environment and implementation details (whether glibc is used or not). It adds new eBPF program type `BPF_PROG_TYPE_CGROUP_SOCK_ADDR` and attach types `BPF_CGROUP_INET4_BIND` and `BPF_CGROUP_INET6_BIND` (similar to already existing `BPF_CGROUP_INET_SOCK_CREATE`). The new program type is intended to be used with sockets (`struct sock`) in a cgroup and provided by user `struct sockaddr`. Pointers to both of them are parts of the context passed to programs of newly added types. The new attach types provides hooks in `bind(2)` system call for both IPv4 and IPv6 so that one can write a program to override IP addresses and ports user program tries to bind to and apply such a program for whole cgroup. == Implementation notes == [1] Separate attach types for `AF_INET` and `AF_INET6` are added intentionally to prevent reading/writing to offsets that don't make sense for corresponding socket family. E.g. if user passes `sockaddr_in` it doesn't make sense to read from / write to `user_ip6[]` context fields. [2] The write access to `struct bpf_sock_addr_kern` is implemented using special field as an additional "register". There are just two registers in `sock_addr_convert_ctx_access`: `src` with value to write and `dst` with pointer to context that can't be changed not to break later instructions. But the fields, allowed to write to, are not available directly and to access them address of corresponding pointer has to be loaded first. To get additional register the 1st not used by `src` and `dst` one is taken, its content is saved to `bpf_sock_addr_kern.tmp_reg`, then the register is used to load address of pointer field, and finally the register's content is restored from the temporary field after writing `src` value. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> |
|||
| 5e43f899b0 |
bpf: Check attach type at prog load time
== The problem == There are use-cases when a program of some type can be attached to multiple attach points and those attach points must have different permissions to access context or to call helpers. E.g. context structure may have fields for both IPv4 and IPv6 but it doesn't make sense to read from / write to IPv6 field when attach point is somewhere in IPv4 stack. Same applies to BPF-helpers: it may make sense to call some helper from some attach point, but not from other for same prog type. == The solution == Introduce `expected_attach_type` field in in `struct bpf_attr` for `BPF_PROG_LOAD` command. If scenario described in "The problem" section is the case for some prog type, the field will be checked twice: 1) At load time prog type is checked to see if attach type for it must be known to validate program permissions correctly. Prog will be rejected with EINVAL if it's the case and `expected_attach_type` is not specified or has invalid value. 2) At attach time `attach_type` is compared with `expected_attach_type`, if prog type requires to have one, and, if they differ, attach will be rejected with EINVAL. The `expected_attach_type` is now available as part of `struct bpf_prog` in both `bpf_verifier_ops->is_valid_access()` and `bpf_verifier_ops->get_func_proto()` () and can be used to check context accesses and calls to helpers correspondingly. Initially the idea was discussed by Alexei Starovoitov <ast@fb.com> and Daniel Borkmann <daniel@iogearbox.net> here: https://marc.info/?l=linux-netdev&m=152107378717201&w=2 Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> |
|||
| fa246693a1 |
bpf: sockmap, BPF_F_INGRESS flag for BPF_SK_SKB_STREAM_VERDICT:
Add support for the BPF_F_INGRESS flag in skb redirect helper. To do this convert skb into a scatterlist and push into ingress queue. This is the same logic that is used in the sk_msg redirect helper so it should feel familiar. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> |
|||
| 8934ce2fd0 |
bpf: sockmap redirect ingress support
Add support for the BPF_F_INGRESS flag in sk_msg redirect helper. To do this add a scatterlist ring for receiving socks to check before calling into regular recvmsg call path. Additionally, because the poll wakeup logic only checked the skb recv queue we need to add a hook in TCP stack (similar to write side) so that we have a way to wake up polling socks when a scatterlist is redirected to that sock. After this all that is needed is for the redirect helper to push the scatterlist into the psock receive queue. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> |
|||
| e59ac63490 |
bpf: add parenthesis around argument of BPF_LDST_BYTES()
BPF_LDST_BYTES() does not put it's argument in parenthesis when referencing it. This makes it impossible to pass pointers obtained by address-of operator (e.g. BPF_LDST_BYTES(&insn)). Add the parenthesis. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> |
|||
| 4f738adba3 |
bpf: create tcp_bpf_ulp allowing BPF to monitor socket TX/RX data
This implements a BPF ULP layer to allow policy enforcement and monitoring at the socket layer. In order to support this a new program type BPF_PROG_TYPE_SK_MSG is used to run the policy at the sendmsg/sendpage hook. To attach the policy to sockets a sockmap is used with a new program attach type BPF_SK_MSG_VERDICT. Similar to previous sockmap usages when a sock is added to a sockmap, via a map update, if the map contains a BPF_SK_MSG_VERDICT program type attached then the BPF ULP layer is created on the socket and the attached BPF_PROG_TYPE_SK_MSG program is run for every msg in sendmsg case and page/offset in sendpage case. BPF_PROG_TYPE_SK_MSG Semantics/API: BPF_PROG_TYPE_SK_MSG supports only two return codes SK_PASS and SK_DROP. Returning SK_DROP free's the copied data in the sendmsg case and in the sendpage case leaves the data untouched. Both cases return -EACESS to the user. Returning SK_PASS will allow the msg to be sent. In the sendmsg case data is copied into kernel space buffers before running the BPF program. The kernel space buffers are stored in a scatterlist object where each element is a kernel memory buffer. Some effort is made to coalesce data from the sendmsg call here. For example a sendmsg call with many one byte iov entries will likely be pushed into a single entry. The BPF program is run with data pointers (start/end) pointing to the first sg element. In the sendpage case data is not copied. We opt not to copy the data by default here, because the BPF infrastructure does not know what bytes will be needed nor when they will be needed. So copying all bytes may be wasteful. Because of this the initial start/end data pointers are (0,0). Meaning no data can be read or written. This avoids reading data that may be modified by the user. A new helper is added later in this series if reading and writing the data is needed. The helper call will do a copy by default so that the page is exclusively owned by the BPF call. The verdict from the BPF_PROG_TYPE_SK_MSG applies to the entire msg in the sendmsg() case and the entire page/offset in the sendpage case. This avoids ambiguity on how to handle mixed return codes in the sendmsg case. Again a helper is added later in the series if a verdict needs to apply to multiple system calls and/or only a subpart of the currently being processed message. The helper msg_redirect_map() can be used to select the socket to send the data on. This is used similar to existing redirect use cases. This allows policy to redirect msgs. Pseudo code simple example: The basic logic to attach a program to a socket is as follows, // load the programs bpf_prog_load(SOCKMAP_TCP_MSG_PROG, BPF_PROG_TYPE_SK_MSG, &obj, &msg_prog); // lookup the sockmap bpf_map_msg = bpf_object__find_map_by_name(obj, "my_sock_map"); // get fd for sockmap map_fd_msg = bpf_map__fd(bpf_map_msg); // attach program to sockmap bpf_prog_attach(msg_prog, map_fd_msg, BPF_SK_MSG_VERDICT, 0); Adding sockets to the map is done in the normal way, // Add a socket 'fd' to sockmap at location 'i' bpf_map_update_elem(map_fd_msg, &i, fd, BPF_ANY); After the above any socket attached to "my_sock_map", in this case 'fd', will run the BPF msg verdict program (msg_prog) on every sendmsg and sendpage system call. For a complete example see BPF selftests or sockmap samples. Implementation notes: It seemed the simplest, to me at least, to use a refcnt to ensure psock is not lost across the sendmsg copy into the sg, the bpf program running on the data in sg_data, and the final pass to the TCP stack. Some performance testing may show a better method to do this and avoid the refcnt cost, but for now use the simpler method. Another item that will come after basic support is in place is supporting MSG_MORE flag. At the moment we call sendpages even if the MSG_MORE flag is set. An enhancement would be to collect the pages into a larger scatterlist and pass down the stack. Notice that bpf_tcp_sendmsg() could support this with some additional state saved across sendmsg calls. I built the code to support this without having to do refactoring work. Other features TBD include ZEROCOPY and the TCP_RECV_QUEUE/TCP_NO_QUEUE support. This will follow initial series shortly. Future work could improve size limits on the scatterlist rings used here. Currently, we use MAX_SKB_FRAGS simply because this was being used already in the TLS case. Future work could extend the kernel sk APIs to tune this depending on workload. This is a trade-off between memory usage and throughput performance. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> |
|||
| 297dd12cb1 |
net: avoid including xdp.h in filter.h
If is sufficient with a forward declaration of struct xdp_rxq_info in linux/filter.h, which avoids including net/xdp.h. This was originally suggested by John Fastabend during the review phase, but wasn't included in the final patchset revision. Thus, this followup. Suggested-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> |