diff -urN linux.orig/CREDITS linux/CREDITS
--- linux.orig/CREDITS	Mon Oct 21 13:25:57 2002
+++ linux/CREDITS	Mon Oct 21 13:27:53 2002
@@ -2161,9 +2161,15 @@
 N: Corey Minyard
 E: minyard@wf-rch.cirr.com
+E: minyard@mvista.com
+W: http://home.attbi.com/~minyard
 D: Sony CDU31A CDROM Driver
-S: 1805 Marquette
-S: Richardson, Texas 75081
+D: IPMI driver
+D: Various networking fixes long ago
+D: Original ppc_md work
+D: Shared zlib
+S: 7406 Wheat Field Rd
+S: Garland, Texas 75066
 S: USA

 N: Patrick Mochel
diff -urN linux.orig/Documentation/IPMI.txt linux/Documentation/IPMI.txt
--- linux.orig/Documentation/IPMI.txt	Wed Dec 31 18:00:00 1969
+++ linux/Documentation/IPMI.txt	Tue Oct 22 09:35:21 2002
@@ -0,0 +1,317 @@
+
+                     The Linux IPMI Driver
+                     ---------------------
+                         Corey Minyard
+                      <minyard@mvista.com>
+                       <minyard@acm.org>
+
+This document describes how to use the IPMI driver for Linux. If you
+are not familiar with IPMI itself, see the web site at
+http://www.intel.com/design/servers/ipmi/index.htm. IPMI is a big
+subject and I can't cover it all here!
+
+Basic Design
+------------
+
+The Linux IPMI driver is designed to be very modular and flexible; you
+only need to take the pieces you need, and you can use it in many
+different ways. Because of that, it's broken into many chunks of
+code. These chunks are:
+
+ipmi_msghandler - This is the central piece of software for the IPMI
+system. It handles all messages, message timing, and responses. The
+IPMI users tie into this, and the IPMI physical interfaces (called
+System Management Interfaces, or SMIs) also tie in here. This
+provides the kernelland interface for IPMI, but does not provide an
+interface for use by application processes.
+
+ipmi_devintf - This provides a userland IOCTL interface for the IPMI
+driver; each open file for this device ties in to the message handler
+as an IPMI user.
+
+ipmi_kcs_drv - A driver for the KCS SMI. Most systems have a KCS
+interface for IPMI.
+
+
+Much documentation for the interface is in the include files. The
+IPMI include files are:
+
+ipmi.h - Contains the user interface and IOCTL interface for IPMI.
+
+ipmi_smi.h - Contains the interface for SMI drivers to use.
+
+ipmi_msgdefs.h - General definitions for base IPMI messaging.
+
+
+Addressing
+----------
+
+IPMI addressing works much like IP addresses; you have an overlay
+to handle the different address types. The overlay is:
+
+  struct ipmi_addr
+  {
+	int   addr_type;
+	short channel;
+	char  data[IPMI_MAX_ADDR_SIZE];
+  };
+
+The addr_type determines what the address really is. The driver
+currently understands two different types of addresses.
+
+"System Interface" addresses are defined as:
+
+  struct ipmi_system_interface_addr
+  {
+	int   addr_type;
+	short channel;
+  };
+
+and the type is IPMI_SYSTEM_INTERFACE_ADDR_TYPE. This is used for talking
+straight to the BMC on the current card. The channel must be
+IPMI_BMC_CHANNEL.
+
+Messages that are destined to go out on the IPMB bus use the
+IPMI_IPMB_ADDR_TYPE address type. The format is
+
+  struct ipmi_ipmb_addr
+  {
+	int           addr_type;
+	short         channel;
+	unsigned char slave_addr;
+	unsigned char lun;
+  };
+
+The "channel" here is generally zero, but some devices support more
+than one channel; it corresponds to the channel as defined in the IPMI
+spec.
+
+
+Messages
+--------
+
+Messages are defined as:
+
+struct ipmi_msg
+{
+	unsigned char netfn;
+	unsigned char lun;
+	unsigned char cmd;
+	unsigned char *data;
+	int           data_len;
+};
+
+The driver takes care of adding/stripping the header information.
+The data portion is just the data to be sent (do NOT put addressing
+info here) or the response. Note that the completion code of a
+response is the first item in "data"; it is not stripped out, because
+that is how all the messages are defined in the spec (and thus makes
+counting the offsets a little easier :-).
+
+When using the IOCTL interface from userland, you must provide a block
+of data for "data", fill it, and set data_len to the length of the
+block of data, even when receiving messages. Otherwise the driver
+will have no place to put the message.
+
+Messages coming up from the message handler in kernelland will come in
+as:
+
+  struct ipmi_recv_msg
+  {
+	struct list_head link;
+
+	/* The type of message as defined in the "Receive Types"
+	   defines above. */
+	int recv_type;
+
+	ipmi_user_t      *user;
+	struct ipmi_addr addr;
+	long             msgid;
+	struct ipmi_msg  msg;
+
+	/* Call this when done with the message. It will presumably free
+	   the message and do any other necessary cleanup. */
+	void (*done)(struct ipmi_recv_msg *msg);
+
+	/* Place-holder for the data, don't make any assumptions about
+	   the size or existence of this, since it may change. */
+	unsigned char msg_data[IPMI_MAX_MSG_LENGTH];
+  };
+
+You should look at the receive type and handle the message
+appropriately.
+
+
+The Upper Layer Interface (Message Handler)
+-------------------------------------------
+
+The upper layer of the interface provides the users with a consistent
+view of the IPMI interfaces. It allows multiple SMI interfaces to be
+addressed (because some boards actually have multiple BMCs on them),
+and the user should not have to care what type of SMI is below them.
+
+
+Creating the User
+
+To use the message handler, you must first create a user using
+ipmi_create_user. The interface number specifies which SMI you want
+to connect to, and you must supply callback functions to be called
+when data comes in. The callback function can run at interrupt level,
+so be careful using the callbacks. This also allows you to pass in a
+piece of data, the handler_data, that will be passed back to you on
+all calls.
+
+Once you are done, call ipmi_destroy_user() to get rid of the user.
+
+From userland, opening the device automatically creates a user, and
+closing the device automatically destroys the user.
+
+
+Messaging
+
+To send a message from kernel-land, the ipmi_request() call does
+pretty much all message handling. Most of the parameters are
+self-explanatory. However, it takes a "msgid" parameter. This is NOT
+the sequence number of messages. It is simply a long value that is
+passed back when the response for the message is returned. You may
+use it for anything you like.
+
+Responses come back in the function pointed to by the ipmi_recv_hndl
+field of the "handler" that you passed in to ipmi_create_user().
+Remember again, these may be running at interrupt level. Remember to
+look at the receive type, too.
+
+From userland, you fill out an ipmi_req_t structure and use the
+IPMICTL_SEND_COMMAND ioctl. For incoming stuff, you can use select()
+or poll() to wait for messages to come in. However, you cannot use
+read() to get them; you must call the IPMICTL_RECEIVE_MSG ioctl with the
+ipmi_recv_t structure to actually get the message. Remember that you
+must supply a pointer to a block of data in the msg.data field, and
+you must fill in the msg.data_len field with the size of the data.
+This gives the receiver a place to actually put the message.
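+
+As an example, here is a minimal userland sketch that sends a Get
+Device ID command (netfn 0x06, cmd 0x01) to the BMC and waits for the
+response. This is only an illustration; the device name used here
+(/dev/ipmidev/0) depends on your setup, and the error handling is
+simplified:
+
+	#include <fcntl.h>
+	#include <string.h>
+	#include <sys/ioctl.h>
+	#include <sys/poll.h>
+	#include <linux/ipmi.h>
+
+	int get_device_id(void)
+	{
+		struct ipmi_system_interface_addr addr;
+		struct ipmi_req  req;
+		struct ipmi_recv rsp;
+		unsigned char    data[IPMI_MAX_MSG_LENGTH];
+		struct pollfd    pfd;
+		int              fd;
+
+		fd = open("/dev/ipmidev/0", O_RDWR);
+		if (fd < 0)
+			return -1;
+
+		/* Talk straight to the BMC on this board. */
+		addr.addr_type = IPMI_SYSTEM_INTERFACE_ADDR_TYPE;
+		addr.channel = IPMI_BMC_CHANNEL;
+
+		memset(&req, 0, sizeof(req));
+		req.addr = (unsigned char *) &addr;
+		req.addr_len = sizeof(addr);
+		req.msgid = 1;        /* anything; it comes back in the response */
+		req.msg.netfn = 0x06; /* App netfn */
+		req.msg.cmd = 0x01;   /* Get Device ID */
+		req.msg.data = NULL;
+		req.msg.data_len = 0;
+		if (ioctl(fd, IPMICTL_SEND_COMMAND, &req) < 0)
+			return -1;
+
+		pfd.fd = fd;
+		pfd.events = POLLIN;
+		poll(&pfd, 1, -1); /* wait for the response */
+
+		/* You must supply the receive buffer yourself. */
+		memset(&rsp, 0, sizeof(rsp));
+		rsp.addr = (unsigned char *) &addr;
+		rsp.addr_len = sizeof(addr);
+		rsp.msg.data = data;
+		rsp.msg.data_len = sizeof(data);
+		if (ioctl(fd, IPMICTL_RECEIVE_MSG, &rsp) < 0)
+			return -1;
+
+		/* data[0] is the completion code of the response. */
+		return data[0];
+	}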
+
+If the message cannot fit into the data you provide, you will get an
+EMSGSIZE error and the driver will leave the data in the receive
+queue. If you want to get it and have it truncate the message, use
+the IPMICTL_RECEIVE_MSG_TRUNC ioctl.
+
+When you send a command (which is defined by the lowest-order bit of
+the netfn per the IPMI spec) on the IPMB bus, the driver will
+automatically assign the sequence number to the command and save the
+command. If the response is not received in the IPMI-specified 5
+seconds, it will generate a response automatically saying the command
+timed out. If an unsolicited response comes in (if it was after 5
+seconds, for instance), that response will be ignored.
+
+In kernelland, after you receive a message and are done with it, you
+MUST call ipmi_free_recv_msg() on it, or you will leak messages. Note
+that you should NEVER mess with the "done" field of a message; that is
+required to properly clean up the message.
+
+Note that when sending, there is an ipmi_request_supply_msgs() call
+that lets you supply the SMI and receive messages. This is useful for
+pieces of code that need to work even if the system is out of buffers
+(the watchdog timer uses this, for instance). You supply your own
+buffer and your own free routines. This is not recommended for normal
+use, though, since it is tricky to manage your own buffers.
+
+
+Events and Incoming Commands
+
+The driver takes care of polling for IPMI events and receiving
+commands (commands are messages that are not responses; they are
+commands that other things on the IPMB bus have sent you). To receive
+these, you must register for them; they will not automatically be sent
+to you.
+
+To receive events, you must call ipmi_set_gets_events() and set the
+"val" to non-zero. Any events that have been received by the driver
+since startup will immediately be delivered to the first user that
+registers for events. After that, if multiple users are registered
+for events, they will all receive all events that come in.
+
+For receiving commands, you have to individually register commands you
+want to receive. Call ipmi_register_for_cmd() and supply the netfn
+and command name for each command you want to receive. Only one user
+may be registered for each netfn/cmd, but different users may register
+for different commands.
+
+From userland, equivalent IOCTLs are provided to do these functions.
+
+
+The Lower Layer (SMI) Interface
+-------------------------------
+
+As mentioned before, multiple SMI interfaces may be registered to the
+message handler; each of these is assigned an interface number when
+it registers with the message handler. They are generally assigned
+in the order they register, although if an SMI unregisters and then
+another one registers, all bets are off.
+
+ipmi_smi.h defines the interface for SMIs; see that for more
+details.
+
+
+The KCS Driver
+--------------
+
+The KCS driver allows up to 4 KCS interfaces to be configured in the
+system. By default, the driver will register one KCS interface at the
+spec-specified address 0xca2 without interrupts. You can change this
+at module load time (for a module) with:
+
+  insmod ipmi_kcs_drv.o kcs_addrs=<addr1>,<addr2>.. kcs_irqs=<irq1>,<irq2>..
+
+When compiled into the kernel, the addresses can be specified on the
+kernel command line as:
+
+  ipmi_kcs=<addr1>,<irq1>,<addr2>,<irq2>....
+
+If you specify zero for an address, the driver will use 0xca2. If you
+specify zero for an IRQ, the driver will run polled.
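+
+For example, to configure two KCS interfaces, the first at the default
+address running polled and the second at 0xca8 using interrupt 5 (the
+second address and interrupt are just example values; use whatever
+your hardware actually has):
+
+  insmod ipmi_kcs_drv.o kcs_addrs=0,0xca8 kcs_irqs=0,5
+
+or, when the driver is compiled into the kernel:
+
+  ipmi_kcs=0,0,0xca8,5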
+
+If you have high-res timers compiled into the kernel, the driver will
+use them to provide much better performance. Note that if you do not
+have high-res timers enabled in the kernel and you don't have
+interrupts enabled, the driver will run VERY slowly. Don't blame me,
+the KCS interface sucks.
+
+
+Other Pieces
+------------
+
+Watchdog
+
+A watchdog timer is provided that implements the Linux-standard
+watchdog timer interface. It has four module parameters that can be
+used to control it:
+
+  insmod ipmi_watchdog timeout=<t> pretimeout=<t> action=<action type>
+      preaction=<preaction type>
+
+The timeout is the number of seconds to the action, and the pretimeout
+is the number of seconds before the reset that the pre-timeout panic will
+occur (if pretimeout is zero, then pretimeout will not be enabled).
+
+The action may be "reset", "power_cycle", or "power_off"; it
+specifies what to do when the timer times out and defaults to
+"reset".
+
+The preaction may be "pre_smi" for an indication through the SMI
+interface, "pre_int" for an indication through the SMI with an
+interrupt, or "pre_nmi" for an NMI on a pretimeout.
+
+When compiled into the kernel, the kernel command line is available
+for configuring the watchdog:
+
+  ipmi_wdog=<timeout>[,<pretimeout>[,<option>[,<options>....]]]
+
+The options are the actions and preactions above (if an option
+controlling the same thing is specified twice, the last is taken). An
+option "start_now" is also available; if included, the watchdog will
+start running immediately when all the drivers are ready; it doesn't
+have to have a user hooked up to start it.
+
+The watchdog will panic and start a 120 second reset timeout if it
+gets a pretimeout. During a panic or a reboot, the watchdog will
+start a 120 second timer if it is running, to make sure the reboot
+occurs.
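+
+For example, to get a reset 60 seconds after the watchdog stops being
+touched, with an NMI 10 seconds before the reset (the times here are
+just example values):
+
+  insmod ipmi_watchdog timeout=60 pretimeout=10 action=reset
+      preaction=pre_nmi
+
+or, on the kernel command line, additionally starting the watchdog
+immediately:
+
+  ipmi_wdog=60,10,reset,pre_nmi,start_now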
diff -urN linux.orig/arch/i386/kernel/Makefile linux/arch/i386/kernel/Makefile --- linux.orig/arch/i386/kernel/Makefile Mon Oct 21 13:25:58 2002 +++ linux/arch/i386/kernel/Makefile Thu Oct 24 12:48:14 2002 @@ -9,7 +9,7 @@ obj-y := process.o semaphore.o signal.o entry.o traps.o irq.o vm86.o \ ptrace.o i8259.o ioport.o ldt.o setup.o time.o sys_i386.o \ pci-dma.o i386_ksyms.o i387.o bluesmoke.o dmi_scan.o \ - bootflag.o + bootflag.o nmi.o obj-y += cpu/ obj-y += timers/ @@ -23,7 +23,7 @@ obj-$(CONFIG_ACPI_SLEEP) += acpi_wakeup.o obj-$(CONFIG_X86_SMP) += smp.o smpboot.o trampoline.o obj-$(CONFIG_X86_MPPARSE) += mpparse.o -obj-$(CONFIG_X86_LOCAL_APIC) += apic.o nmi.o +obj-$(CONFIG_X86_LOCAL_APIC) += apic.o nmi_watchdog.o obj-$(CONFIG_X86_IO_APIC) += io_apic.o obj-$(CONFIG_SOFTWARE_SUSPEND) += suspend.o obj-$(CONFIG_X86_NUMAQ) += numaq.o diff -urN linux.orig/arch/i386/kernel/i386_ksyms.c linux/arch/i386/kernel/i386_ksyms.c --- linux.orig/arch/i386/kernel/i386_ksyms.c Mon Oct 21 13:25:58 2002 +++ linux/arch/i386/kernel/i386_ksyms.c Thu Oct 24 14:01:05 2002 @@ -90,6 +90,9 @@ EXPORT_SYMBOL(cpu_khz); EXPORT_SYMBOL(apm_info); +EXPORT_SYMBOL(request_nmi); +EXPORT_SYMBOL(release_nmi); + #ifdef CONFIG_DEBUG_IOVIRT EXPORT_SYMBOL(__io_virt_debug); #endif @@ -176,8 +179,6 @@ EXPORT_SYMBOL_GPL(register_profile_notifier); EXPORT_SYMBOL_GPL(unregister_profile_notifier); -EXPORT_SYMBOL_GPL(set_nmi_callback); -EXPORT_SYMBOL_GPL(unset_nmi_callback); #undef memcpy #undef memset diff -urN linux.orig/arch/i386/kernel/irq.c linux/arch/i386/kernel/irq.c --- linux.orig/arch/i386/kernel/irq.c Mon Oct 21 13:25:58 2002 +++ linux/arch/i386/kernel/irq.c Tue Oct 22 12:08:20 2002 @@ -131,6 +131,8 @@ * Generic, controller-independent functions: */ +extern void nmi_append_user_names(struct seq_file *p); + int show_interrupts(struct seq_file *p, void *v) { int i, j; @@ -166,6 +168,8 @@ for (j = 0; j < NR_CPUS; j++) if (cpu_online(j)) p += seq_printf(p, "%10u ", nmi_count(j)); + seq_printf(p, " "); + nmi_append_user_names(p); seq_putc(p, '\n'); #if CONFIG_X86_LOCAL_APIC seq_printf(p, "LOC: "); diff -urN linux.orig/arch/i386/kernel/nmi.c linux/arch/i386/kernel/nmi.c --- linux.orig/arch/i386/kernel/nmi.c Mon Oct 21 13:25:45 2002 +++ linux/arch/i386/kernel/nmi.c Fri Oct 25 08:21:22 2002 @@ -1,404 +1,239 @@ /* * linux/arch/i386/nmi.c * - * NMI watchdog support on APIC systems + * NMI support. * - * Started by Ingo Molnar <mingo@redhat.com> + * Corey Minyard <cminyard@mvista.com> * - * Fixes: - * Mikael Pettersson : AMD K7 support for local APIC NMI watchdog. - * Mikael Pettersson : Power Management for local APIC NMI watchdog. - * Mikael Pettersson : Pentium 4 support for local APIC NMI watchdog. + * Moved some of this over from traps.c. 
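+ *
+ * Handlers tie into this with request_nmi() and release_nmi(). A
+ * minimal sketch of a user (illustrative only; the handler name,
+ * device name, and the my_hw_caused_nmi() check are made up):
+ *
+ *	static int my_nmi(void *dev_id, struct pt_regs *regs,
+ *			  int cpu, int handled)
+ *	{
+ *		if (!my_hw_caused_nmi())
+ *			return NOTIFY_DONE;	(not our NMI)
+ *		... handle it, without claiming spinlocks ...
+ *		return NOTIFY_OK;
+ *	}
+ *
+ *	static struct nmi_handler my_nmi_handler =
+ *	{
+ *		.link     = LIST_HEAD_INIT(my_nmi_handler.link),
+ *		.dev_name = "my_driver",
+ *		.dev_id   = NULL,
+ *		.handler  = my_nmi,
+ *		.priority = 128,	(mid-level, like nmi_std below)
+ *	};
+ *
+ * The handler is added with request_nmi(&my_nmi_handler) and removed
+ * with release_nmi(&my_nmi_handler); release_nmi() waits until it is
+ * safe to free the structure.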
*/ #include <linux/config.h> -#include <linux/mm.h> -#include <linux/irq.h> #include <linux/delay.h> -#include <linux/bootmem.h> -#include <linux/smp_lock.h> +#include <linux/spinlock.h> +#include <linux/list.h> +#include <linux/sched.h> +#include <linux/errno.h> +#include <linux/rcupdate.h> +#include <linux/seq_file.h> +#include <linux/notifier.h> #include <linux/interrupt.h> -#include <linux/mc146818rtc.h> -#include <linux/kernel_stat.h> -#include <asm/smp.h> -#include <asm/mtrr.h> -#include <asm/mpspec.h> - -unsigned int nmi_watchdog = NMI_NONE; -static unsigned int nmi_hz = HZ; -unsigned int nmi_perfctr_msr; /* the MSR to reset in NMI handler */ -extern void show_registers(struct pt_regs *regs); - -#define K7_EVNTSEL_ENABLE (1 << 22) -#define K7_EVNTSEL_INT (1 << 20) -#define K7_EVNTSEL_OS (1 << 17) -#define K7_EVNTSEL_USR (1 << 16) -#define K7_EVENT_CYCLES_PROCESSOR_IS_RUNNING 0x76 -#define K7_NMI_EVENT K7_EVENT_CYCLES_PROCESSOR_IS_RUNNING - -#define P6_EVNTSEL0_ENABLE (1 << 22) -#define P6_EVNTSEL_INT (1 << 20) -#define P6_EVNTSEL_OS (1 << 17) -#define P6_EVNTSEL_USR (1 << 16) -#define P6_EVENT_CPU_CLOCKS_NOT_HALTED 0x79 -#define P6_NMI_EVENT P6_EVENT_CPU_CLOCKS_NOT_HALTED - -#define MSR_P4_MISC_ENABLE 0x1A0 -#define MSR_P4_MISC_ENABLE_PERF_AVAIL (1<<7) -#define MSR_P4_MISC_ENABLE_PEBS_UNAVAIL (1<<12) -#define MSR_P4_PERFCTR0 0x300 -#define MSR_P4_CCCR0 0x360 -#define P4_ESCR_EVENT_SELECT(N) ((N)<<25) -#define P4_ESCR_OS (1<<3) -#define P4_ESCR_USR (1<<2) -#define P4_CCCR_OVF_PMI (1<<26) -#define P4_CCCR_THRESHOLD(N) ((N)<<20) -#define P4_CCCR_COMPLEMENT (1<<19) -#define P4_CCCR_COMPARE (1<<18) -#define P4_CCCR_REQUIRED (3<<16) -#define P4_CCCR_ESCR_SELECT(N) ((N)<<13) -#define P4_CCCR_ENABLE (1<<12) -/* Set up IQ_COUNTER0 to behave like a clock, by having IQ_CCCR0 filter - CRU_ESCR0 (with any non-null event selector) through a complemented - max threshold. [IA32-Vol3, Section 14.9.9] */ -#define MSR_P4_IQ_COUNTER0 0x30C -#define MSR_P4_IQ_CCCR0 0x36C -#define MSR_P4_CRU_ESCR0 0x3B8 -#define P4_NMI_CRU_ESCR0 (P4_ESCR_EVENT_SELECT(0x3F)|P4_ESCR_OS|P4_ESCR_USR) -#define P4_NMI_IQ_CCCR0 \ - (P4_CCCR_OVF_PMI|P4_CCCR_THRESHOLD(15)|P4_CCCR_COMPLEMENT| \ - P4_CCCR_COMPARE|P4_CCCR_REQUIRED|P4_CCCR_ESCR_SELECT(4)|P4_CCCR_ENABLE) - -int __init check_nmi_watchdog (void) -{ - unsigned int prev_nmi_count[NR_CPUS]; - int cpu; - - printk(KERN_INFO "testing NMI watchdog ... "); - - for (cpu = 0; cpu < NR_CPUS; cpu++) - prev_nmi_count[cpu] = irq_stat[cpu].__nmi_count; - local_irq_enable(); - mdelay((10*1000)/nmi_hz); // wait 10 ticks - - /* FIXME: Only boot CPU is online at this stage. Check CPUs - as they come up. */ - for (cpu = 0; cpu < NR_CPUS; cpu++) { - if (!cpu_online(cpu)) - continue; - if (nmi_count(cpu) - prev_nmi_count[cpu] <= 5) { - printk("CPU#%d: NMI appears to be stuck!\n", cpu); - return -1; - } - } - printk("OK.\n"); +#include <asm/io.h> +#include <asm/nmi.h> - /* now that we know it works we can reduce NMI frequency to - something more reasonable; makes a difference in some configs */ - if (nmi_watchdog == NMI_LOCAL_APIC) - nmi_hz = 1; +extern void show_registers(struct pt_regs *regs); - return 0; -} +/* + * A list of handlers for NMIs. This list will be called in order + * when an NMI from an otherwise unidentifiable source comes in. If + * one of these handles the NMI, it should return NOTIFY_OK, otherwise + * it should return NOTIFY_DONE. NMI handlers cannot claim spinlocks, + * so we have to handle freeing these in a different manner. 
A + * spinlock protects the list from multiple writers. When something + * is removed from the list, it is thrown into another list (with + * another link, so the "next" element stays valid) and scheduled to + * run as an rcu. When the rcu runs, it is guaranteed that nothing in + * the NMI code will be using it. + */ +static struct list_head nmi_handler_list = LIST_HEAD_INIT(nmi_handler_list); +static spinlock_t nmi_handler_lock = SPIN_LOCK_UNLOCKED; -static int __init setup_nmi_watchdog(char *str) +/* + * To free the list item, we use an rcu. The rcu-function will not + * run until all processors have done a context switch, gone idle, or + * gone to a user process, so it's guaranteed that when this runs, any + * NMI handler running at release time has completed and the list item + * can be safely freed. + */ +static void free_nmi_handler(void *arg) { - int nmi; + struct nmi_handler *handler = arg; - get_option(&str, &nmi); - - if (nmi >= NMI_INVALID) - return 0; - if (nmi == NMI_NONE) - nmi_watchdog = nmi; - /* - * If any other x86 CPU has a local APIC, then - * please test the NMI stuff there and send me the - * missing bits. Right now Intel P6/P4 and AMD K7 only. - */ - if ((nmi == NMI_LOCAL_APIC) && - (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) && - (boot_cpu_data.x86 == 6 || boot_cpu_data.x86 == 15)) - nmi_watchdog = nmi; - if ((nmi == NMI_LOCAL_APIC) && - (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) && - (boot_cpu_data.x86 == 6)) - nmi_watchdog = nmi; - /* - * We can enable the IO-APIC watchdog - * unconditionally. - */ - if (nmi == NMI_IO_APIC) - nmi_watchdog = nmi; - return 1; + INIT_LIST_HEAD(&(handler->link)); + complete(&(handler->complete)); } -__setup("nmi_watchdog=", setup_nmi_watchdog); - -#ifdef CONFIG_PM +int request_nmi(struct nmi_handler *handler) +{ + struct list_head *curr; + struct nmi_handler *curr_h = NULL; -#include <linux/pm.h> + if (!list_empty(&(handler->link))) + return -EBUSY; -struct pm_dev *nmi_pmdev; + spin_lock(&nmi_handler_lock); -static void disable_apic_nmi_watchdog(void) -{ - switch (boot_cpu_data.x86_vendor) { - case X86_VENDOR_AMD: - wrmsr(MSR_K7_EVNTSEL0, 0, 0); - break; - case X86_VENDOR_INTEL: - switch (boot_cpu_data.x86) { - case 6: - wrmsr(MSR_P6_EVNTSEL0, 0, 0); + __list_for_each(curr, &nmi_handler_list) { + curr_h = list_entry(curr, struct nmi_handler, link); + if (curr_h->priority <= handler->priority) break; - case 15: - wrmsr(MSR_P4_IQ_CCCR0, 0, 0); - wrmsr(MSR_P4_CRU_ESCR0, 0, 0); - break; - } - break; } -} -static int nmi_pm_callback(struct pm_dev *dev, pm_request_t rqst, void *data) -{ - switch (rqst) { - case PM_SUSPEND: - disable_apic_nmi_watchdog(); - break; - case PM_RESUME: - setup_apic_nmi_watchdog(); - break; - } + /* list_add_rcu takes care of memory barrier */ + if (curr_h) + if (curr_h->priority <= handler->priority) + list_add_rcu(&(handler->link), curr_h->link.prev); + else + list_add_rcu(&(handler->link), &(curr_h->link)); + else + list_add_rcu(&(handler->link), &nmi_handler_list); + + spin_unlock(&nmi_handler_lock); return 0; } -struct pm_dev * set_nmi_pm_callback(pm_callback callback) +void release_nmi(struct nmi_handler *handler) { - apic_pm_unregister(nmi_pmdev); - return apic_pm_register(PM_SYS_DEV, 0, callback); -} + spin_lock(&nmi_handler_lock); + list_del_rcu(&(handler->link)); + init_completion(&(handler->complete)); + call_rcu(&(handler->rcu), free_nmi_handler, handler); + spin_unlock(&nmi_handler_lock); -void unset_nmi_pm_callback(struct pm_dev * dev) -{ - apic_pm_unregister(dev); - nmi_pmdev = 
apic_pm_register(PM_SYS_DEV, 0, nmi_pm_callback); -} - -static void nmi_pm_init(void) -{ - if (!nmi_pmdev) - nmi_pmdev = apic_pm_register(PM_SYS_DEV, 0, nmi_pm_callback); + /* Wait for handler to finish being freed. This can't be + interrupted, we must wait until it finished. */ + wait_for_completion(&(handler->complete)); } -#define __pminit /*empty*/ - -#else /* CONFIG_PM */ - -static inline void nmi_pm_init(void) { } - -#define __pminit __init - -#endif /* CONFIG_PM */ - -/* - * Activate the NMI watchdog via the local APIC. - * Original code written by Keith Owens. - */ - -static void __pminit clear_msr_range(unsigned int base, unsigned int n) +void nmi_append_user_names(struct seq_file *p) { - unsigned int i; + struct list_head *curr; + struct nmi_handler *curr_h; - for(i = 0; i < n; ++i) - wrmsr(base+i, 0, 0); + spin_lock(&nmi_handler_lock); + __list_for_each(curr, &nmi_handler_list) { + curr_h = list_entry(curr, struct nmi_handler, link); + if (curr_h->dev_name) + p += seq_printf(p, " %s", curr_h->dev_name); + } + spin_unlock(&nmi_handler_lock); } -static void __pminit setup_k7_watchdog(void) +static void mem_parity_error(unsigned char reason, struct pt_regs * regs) { - unsigned int evntsel; - - nmi_perfctr_msr = MSR_K7_PERFCTR0; - - clear_msr_range(MSR_K7_EVNTSEL0, 4); - clear_msr_range(MSR_K7_PERFCTR0, 4); + printk("Uhhuh. NMI received. Dazed and confused, but trying to continue\n"); + printk("You probably have a hardware problem with your RAM chips\n"); - evntsel = K7_EVNTSEL_INT - | K7_EVNTSEL_OS - | K7_EVNTSEL_USR - | K7_NMI_EVENT; - - wrmsr(MSR_K7_EVNTSEL0, evntsel, 0); - Dprintk("setting K7_PERFCTR0 to %08lx\n", -(cpu_khz/nmi_hz*1000)); - wrmsr(MSR_K7_PERFCTR0, -(cpu_khz/nmi_hz*1000), -1); - apic_write(APIC_LVTPC, APIC_DM_NMI); - evntsel |= K7_EVNTSEL_ENABLE; - wrmsr(MSR_K7_EVNTSEL0, evntsel, 0); + /* Clear and disable the memory parity error line. */ + reason = (reason & 0xf) | 4; + outb(reason, 0x61); } -static void __pminit setup_p6_watchdog(void) +static void io_check_error(unsigned char reason, struct pt_regs * regs) { - unsigned int evntsel; - - nmi_perfctr_msr = MSR_P6_PERFCTR0; - - clear_msr_range(MSR_P6_EVNTSEL0, 2); - clear_msr_range(MSR_P6_PERFCTR0, 2); - - evntsel = P6_EVNTSEL_INT - | P6_EVNTSEL_OS - | P6_EVNTSEL_USR - | P6_NMI_EVENT; - - wrmsr(MSR_P6_EVNTSEL0, evntsel, 0); - Dprintk("setting P6_PERFCTR0 to %08lx\n", -(cpu_khz/nmi_hz*1000)); - wrmsr(MSR_P6_PERFCTR0, -(cpu_khz/nmi_hz*1000), 0); - apic_write(APIC_LVTPC, APIC_DM_NMI); - evntsel |= P6_EVNTSEL0_ENABLE; - wrmsr(MSR_P6_EVNTSEL0, evntsel, 0); -} + unsigned long i; -static int __pminit setup_p4_watchdog(void) -{ - unsigned int misc_enable, dummy; + printk("NMI: IOCK error (debug interrupt?)\n"); + show_registers(regs); - rdmsr(MSR_P4_MISC_ENABLE, misc_enable, dummy); - if (!(misc_enable & MSR_P4_MISC_ENABLE_PERF_AVAIL)) - return 0; - - nmi_perfctr_msr = MSR_P4_IQ_COUNTER0; - - if (!(misc_enable & MSR_P4_MISC_ENABLE_PEBS_UNAVAIL)) - clear_msr_range(0x3F1, 2); - /* MSR 0x3F0 seems to have a default value of 0xFC00, but current - docs doesn't fully define it, so leave it alone for now. 
*/ - clear_msr_range(0x3A0, 31); - clear_msr_range(0x3C0, 6); - clear_msr_range(0x3C8, 6); - clear_msr_range(0x3E0, 2); - clear_msr_range(MSR_P4_CCCR0, 18); - clear_msr_range(MSR_P4_PERFCTR0, 18); - - wrmsr(MSR_P4_CRU_ESCR0, P4_NMI_CRU_ESCR0, 0); - wrmsr(MSR_P4_IQ_CCCR0, P4_NMI_IQ_CCCR0 & ~P4_CCCR_ENABLE, 0); - Dprintk("setting P4_IQ_COUNTER0 to 0x%08lx\n", -(cpu_khz/nmi_hz*1000)); - wrmsr(MSR_P4_IQ_COUNTER0, -(cpu_khz/nmi_hz*1000), -1); - apic_write(APIC_LVTPC, APIC_DM_NMI); - wrmsr(MSR_P4_IQ_CCCR0, P4_NMI_IQ_CCCR0, 0); - return 1; + /* Re-enable the IOCK line, wait for a few seconds */ + reason = (reason & 0xf) | 8; + outb(reason, 0x61); + i = 2000; + while (--i) udelay(1000); + reason &= ~8; + outb(reason, 0x61); } -void __pminit setup_apic_nmi_watchdog (void) +static void unknown_nmi_error(struct pt_regs * regs, int cpu) { - switch (boot_cpu_data.x86_vendor) { - case X86_VENDOR_AMD: - if (boot_cpu_data.x86 != 6) - return; - setup_k7_watchdog(); - break; - case X86_VENDOR_INTEL: - switch (boot_cpu_data.x86) { - case 6: - setup_p6_watchdog(); - break; - case 15: - if (!setup_p4_watchdog()) - return; - break; - default: - return; - } - break; - default: +#ifdef CONFIG_MCA + /* Might actually be able to figure out what the guilty party + * is. */ + if( MCA_bus ) { + mca_handle_nmi(); return; } - nmi_pm_init(); +#endif + printk("Uhhuh. Received NMI for unknown reason on CPU %d.\n", cpu); + printk("Dazed and confused, but trying to continue\n"); + printk("Do you have a strange power saving mode enabled?\n"); } -static spinlock_t nmi_print_lock = SPIN_LOCK_UNLOCKED; +/* Check "normal" sources of NMI. */ +static int nmi_std (void * dev_id, struct pt_regs * regs, int cpu, int handled) +{ + unsigned char reason; -/* - * the best way to detect whether a CPU has a 'hard lockup' problem - * is to check it's local APIC timer IRQ counts. If they are not - * changing then that CPU has some problem. - * - * as these watchdog NMI IRQs are generated on every CPU, we only - * have to check the current processor. - * - * since NMIs dont listen to _any_ locks, we have to be extremely - * careful not to rely on unsafe variables. The printk might lock - * up though, so we have to break up any console locks first ... - * [when there will be more tty-related locks, break them up - * here too!] - */ + reason = inb(0x61); + if (reason & 0xc0) { + if (reason & 0x80) + mem_parity_error(reason, regs); + if (reason & 0x40) + io_check_error(reason, regs); + return NOTIFY_OK; + } -static unsigned int - last_irq_sums [NR_CPUS], - alert_counter [NR_CPUS]; - -void touch_nmi_watchdog (void) -{ - int i; - - /* - * Just reset the alert counters, (other CPUs might be - * spinning on locks we hold): - */ - for (i = 0; i < NR_CPUS; i++) - alert_counter[i] = 0; + return NOTIFY_DONE; } -void nmi_watchdog_tick (struct pt_regs * regs) +static struct nmi_handler nmi_std_handler = +{ + .link = LIST_HEAD_INIT(nmi_std_handler.link), + .dev_name = "nmi_std", + .dev_id = NULL, + .handler = nmi_std, + .priority = 128, /* mid-level priority. */ +}; + +asmlinkage void do_nmi(struct pt_regs * regs, long error_code) { + struct list_head *curr; + struct nmi_handler *curr_h; + int val; + int cpu = smp_processor_id(); + int handled = 0; + - /* - * Since current_thread_info()-> is always on the stack, and we - * always switch the stack NMI-atomically, it's safe to use - * smp_processor_id(). 
- */ int sum, cpu = smp_processor_id(); + ++nmi_count(cpu); - sum = irq_stat[cpu].apic_timer_irqs; + /* Since NMIs are edge-triggered, we could possibly miss one if we + don't call them all, so we call them all. */ - if (last_irq_sums[cpu] == sum) { - /* - * Ayiee, looks like this CPU is stuck ... - * wait a few IRQs (5 seconds) before doing the oops ... - */ - alert_counter[cpu]++; - if (alert_counter[cpu] == 5*nmi_hz) { - spin_lock(&nmi_print_lock); - /* - * We are in trouble anyway, lets at least try - * to get a message out. - */ - bust_spinlocks(1); - printk("NMI Watchdog detected LOCKUP on CPU%d, eip %08lx, registers:\n", cpu, regs->eip); - show_registers(regs); - printk("console shuts up ...\n"); - console_silent(); - spin_unlock(&nmi_print_lock); - bust_spinlocks(0); - do_exit(SIGSEGV); + __list_for_each_rcu(curr, &nmi_handler_list) { + curr_h = list_entry(curr, struct nmi_handler, link); + val = curr_h->handler(curr_h->dev_id, regs, cpu, handled); + switch (val & ~NOTIFY_STOP_MASK) { + case NOTIFY_OK: + handled = 1; + break; + + case NOTIFY_DONE: + default:; } - } else { - last_irq_sums[cpu] = sum; - alert_counter[cpu] = 0; + if (val & NOTIFY_STOP_MASK) + break; } - if (nmi_perfctr_msr) { - if (nmi_perfctr_msr == MSR_P4_IQ_COUNTER0) { - /* - * P4 quirks: - * - An overflown perfctr will assert its interrupt - * until the OVF flag in its CCCR is cleared. - * - LVTPC is masked on interrupt and must be - * unmasked by the LVTPC handler. - */ - wrmsr(MSR_P4_IQ_CCCR0, P4_NMI_IQ_CCCR0, 0); - apic_write(APIC_LVTPC, APIC_DM_NMI); - } - wrmsr(nmi_perfctr_msr, -(cpu_khz/nmi_hz*1000), -1); + + if (!handled) + unknown_nmi_error(regs, cpu); + else { + /* + * Reassert NMI in case it became active meanwhile + * as it's edge-triggered. Don't do this if the NMI + * wasn't handled to avoid an infinite NMI loop. + * + * This is necessary in case we have another external + * NMI while processing this one. The external NMIs + * are level-generated, but NMIs into the processor are + * edge-triggered, so if you have one NMI source + * come in while another is already there, the level + * will never go down to cause another edge, and + * no more NMIs will happen. This does NOT apply + * to internally generated NMIs, though, so you + * can't use the same trick to only call one handler + * at a time. Otherwise, if two internal NMIs came + * in at the same time you might miss one. + */ + outb(0x8f, 0x70); + inb(0x71); /* dummy */ + outb(0x0f, 0x70); + inb(0x71); /* dummy */ } +} + +void __init init_nmi(void) +{ + request_nmi(&nmi_std_handler); } diff -urN linux.orig/arch/i386/kernel/nmi_watchdog.c linux/arch/i386/kernel/nmi_watchdog.c --- linux.orig/arch/i386/kernel/nmi_watchdog.c Thu Oct 24 19:56:54 2002 +++ linux/arch/i386/kernel/nmi_watchdog.c Thu Oct 24 20:54:19 2002 @@ -0,0 +1,481 @@ +/* + * linux/arch/i386/nmi_watchdog.c + * + * NMI watchdog support on APIC systems + * + * Started by Ingo Molnar <mingo@redhat.com> + * + * Fixes: + * Mikael Pettersson : AMD K7 support for local APIC NMI watchdog. + * Mikael Pettersson : Power Management for local APIC NMI watchdog. + * Mikael Pettersson : Pentium 4 support for local APIC NMI watchdog. 
+ */ + +#include <linux/config.h> +#include <linux/mm.h> +#include <linux/irq.h> +#include <linux/delay.h> +#include <linux/bootmem.h> +#include <linux/smp_lock.h> +#include <linux/interrupt.h> +#include <linux/mc146818rtc.h> +#include <linux/kernel_stat.h> +#include <linux/notifier.h> + +#include <asm/smp.h> +#include <asm/mtrr.h> +#include <asm/mpspec.h> +#include <asm/nmi.h> + +unsigned int nmi_watchdog = NMI_NONE; +static unsigned int nmi_hz = HZ; + +/* This is for I/O APIC, until we can figure out how to tell if it's from the + I/O APIC. If the NMI was not handled before now, we handle it. */ +static int dummy_watchdog_reset(int handled) +{ + return !handled; +} + +/* + * Returns 1 if it is a source of the NMI, and resets the NMI to go + * off again. + */ +static int (*watchdog_reset)(int handled) = dummy_watchdog_reset; + +extern void show_registers(struct pt_regs *regs); + +#define K7_EVNTSEL_ENABLE (1 << 22) +#define K7_EVNTSEL_INT (1 << 20) +#define K7_EVNTSEL_OS (1 << 17) +#define K7_EVNTSEL_USR (1 << 16) +#define K7_EVENT_CYCLES_PROCESSOR_IS_RUNNING 0x76 +#define K7_NMI_EVENT K7_EVENT_CYCLES_PROCESSOR_IS_RUNNING + +#define P6_EVNTSEL0_ENABLE (1 << 22) +#define P6_EVNTSEL_INT (1 << 20) +#define P6_EVNTSEL_OS (1 << 17) +#define P6_EVNTSEL_USR (1 << 16) +#define P6_EVENT_CPU_CLOCKS_NOT_HALTED 0x79 +#define P6_NMI_EVENT P6_EVENT_CPU_CLOCKS_NOT_HALTED + +#define MSR_P4_MISC_ENABLE 0x1A0 +#define MSR_P4_MISC_ENABLE_PERF_AVAIL (1<<7) +#define MSR_P4_MISC_ENABLE_PEBS_UNAVAIL (1<<12) +#define MSR_P4_PERFCTR0 0x300 +#define MSR_P4_CCCR0 0x360 +#define P4_ESCR_EVENT_SELECT(N) ((N)<<25) +#define P4_ESCR_OS (1<<3) +#define P4_ESCR_USR (1<<2) +#define P4_CCCR_OVF_PMI (1<<26) +#define P4_CCCR_THRESHOLD(N) ((N)<<20) +#define P4_CCCR_COMPLEMENT (1<<19) +#define P4_CCCR_COMPARE (1<<18) +#define P4_CCCR_REQUIRED (3<<16) +#define P4_CCCR_ESCR_SELECT(N) ((N)<<13) +#define P4_CCCR_ENABLE (1<<12) +/* Set up IQ_COUNTER0 to behave like a clock, by having IQ_CCCR0 filter + CRU_ESCR0 (with any non-null event selector) through a complemented + max threshold. [IA32-Vol3, Section 14.9.9] */ +#define MSR_P4_IQ_COUNTER0 0x30C +#define MSR_P4_IQ_CCCR0 0x36C +#define MSR_P4_CRU_ESCR0 0x3B8 +#define P4_NMI_CRU_ESCR0 (P4_ESCR_EVENT_SELECT(0x3F)|P4_ESCR_OS|P4_ESCR_USR) +#define P4_NMI_IQ_CCCR0 \ + (P4_CCCR_OVF_PMI|P4_CCCR_THRESHOLD(15)|P4_CCCR_COMPLEMENT| \ + P4_CCCR_COMPARE|P4_CCCR_REQUIRED|P4_CCCR_ESCR_SELECT(4)|P4_CCCR_ENABLE) + +int __init check_nmi_watchdog (void) +{ + unsigned int prev_nmi_count[NR_CPUS]; + int cpu; + + printk(KERN_INFO "testing NMI watchdog ... "); + + for (cpu = 0; cpu < NR_CPUS; cpu++) + prev_nmi_count[cpu] = irq_stat[cpu].__nmi_count; + local_irq_enable(); + mdelay((10*1000)/nmi_hz); // wait 10 ticks + + /* FIXME: Only boot CPU is online at this stage. Check CPUs + as they come up. 
*/ + for (cpu = 0; cpu < NR_CPUS; cpu++) { + if (!cpu_online(cpu)) + continue; + if (nmi_count(cpu) - prev_nmi_count[cpu] <= 5) { + printk("CPU#%d: NMI appears to be stuck!\n", cpu); + return -1; + } + } + printk("OK.\n"); + + /* now that we know it works we can reduce NMI frequency to + something more reasonable; makes a difference in some configs */ + if (nmi_watchdog == NMI_LOCAL_APIC) + nmi_hz = 1; + + return 0; +} + +static int nmi_watchdog_tick (void * dev_id, struct pt_regs * regs, int cpu, + int handled); + +static struct nmi_handler nmi_watchdog_handler = +{ + .link = LIST_HEAD_INIT(nmi_watchdog_handler.link), + .dev_name = "nmi_watchdog", + .dev_id = NULL, + .handler = nmi_watchdog_tick, + .priority = 255, /* We want to be relatively high priority. */ +}; + +static int __init setup_nmi_watchdog(char *str) +{ + int nmi; + + get_option(&str, &nmi); + + if (nmi >= NMI_INVALID) + return 0; + + if (nmi == NMI_NONE) + nmi_watchdog = nmi; + /* + * If any other x86 CPU has a local APIC, then + * please test the NMI stuff there and send me the + * missing bits. Right now Intel P6/P4 and AMD K7 only. + */ + if ((nmi == NMI_LOCAL_APIC) && + (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) && + (boot_cpu_data.x86 == 6 || boot_cpu_data.x86 == 15)) + nmi_watchdog = nmi; + if ((nmi == NMI_LOCAL_APIC) && + (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) && + (boot_cpu_data.x86 == 6)) + nmi_watchdog = nmi; + /* + * We can enable the IO-APIC watchdog + * unconditionally. + */ + if (nmi == NMI_IO_APIC) + nmi_watchdog = nmi; + + if (nmi_watchdog != NMI_NONE) { + if (request_nmi(&nmi_watchdog_handler) != 0) { + /* Couldn't add a watchdog handler, give up. */ + printk(KERN_WARNING + "nmi_watchdog: Couldn't request nmi\n"); + nmi_watchdog = NMI_NONE; + return 0; + } + } + + return 1; +} + +__setup("nmi_watchdog=", setup_nmi_watchdog); + +#ifdef CONFIG_PM + +#include <linux/pm.h> + +struct pm_dev *nmi_pmdev; + +static void disable_apic_nmi_watchdog(void) +{ + switch (boot_cpu_data.x86_vendor) { + case X86_VENDOR_AMD: + wrmsr(MSR_K7_EVNTSEL0, 0, 0); + break; + case X86_VENDOR_INTEL: + switch (boot_cpu_data.x86) { + case 6: + wrmsr(MSR_P6_EVNTSEL0, 0, 0); + break; + case 15: + wrmsr(MSR_P4_IQ_CCCR0, 0, 0); + wrmsr(MSR_P4_CRU_ESCR0, 0, 0); + break; + } + break; + } +} + +static int nmi_pm_callback(struct pm_dev *dev, pm_request_t rqst, void *data) +{ + switch (rqst) { + case PM_SUSPEND: + disable_apic_nmi_watchdog(); + break; + case PM_RESUME: + setup_apic_nmi_watchdog(); + break; + } + return 0; +} + +struct pm_dev * set_nmi_pm_callback(pm_callback callback) +{ + apic_pm_unregister(nmi_pmdev); + return apic_pm_register(PM_SYS_DEV, 0, callback); +} + +void unset_nmi_pm_callback(struct pm_dev * dev) +{ + apic_pm_unregister(dev); + nmi_pmdev = apic_pm_register(PM_SYS_DEV, 0, nmi_pm_callback); +} + +static void nmi_pm_init(void) +{ + if (!nmi_pmdev) + nmi_pmdev = apic_pm_register(PM_SYS_DEV, 0, nmi_pm_callback); +} + +#define __pminit /*empty*/ + +#else /* CONFIG_PM */ + +static inline void nmi_pm_init(void) { } + +#define __pminit __init + +#endif /* CONFIG_PM */ + +/* + * Activate the NMI watchdog via the local APIC. + * Original code written by Keith Owens. 
+ */ + +static void __pminit clear_msr_range(unsigned int base, unsigned int n) +{ + unsigned int i; + + for(i = 0; i < n; ++i) + wrmsr(base+i, 0, 0); +} + +static int k7_watchdog_reset(int handled) +{ + unsigned int low, high; + int source; + + rdmsr(MSR_K7_PERFCTR0, low, high); + source = (low & (1 << 31)) == 0; + if (source) + wrmsr(MSR_K7_PERFCTR0, -(cpu_khz/nmi_hz*1000), -1); + return source; +} + +static void __pminit setup_k7_watchdog(void) +{ + unsigned int evntsel; + + watchdog_reset = k7_watchdog_reset; + + clear_msr_range(MSR_K7_EVNTSEL0, 4); + clear_msr_range(MSR_K7_PERFCTR0, 4); + + evntsel = K7_EVNTSEL_INT + | K7_EVNTSEL_OS + | K7_EVNTSEL_USR + | K7_NMI_EVENT; + + wrmsr(MSR_K7_EVNTSEL0, evntsel, 0); + Dprintk("setting K7_PERFCTR0 to %08lx\n", -(cpu_khz/nmi_hz*1000)); + wrmsr(MSR_K7_PERFCTR0, -(cpu_khz/nmi_hz*1000), -1); + apic_write(APIC_LVTPC, APIC_DM_NMI); + evntsel |= K7_EVNTSEL_ENABLE; + wrmsr(MSR_K7_EVNTSEL0, evntsel, 0); +} + +static int p6_watchdog_reset(int handled) +{ + unsigned int low, high; + int source; + + rdmsr(MSR_P6_PERFCTR0, low, high); + source = (low & (1 << 31)) == 0; + if (source) + wrmsr(MSR_P6_PERFCTR0, -(cpu_khz/nmi_hz*1000), -1); + return source; +} + +static void __pminit setup_p6_watchdog(void) +{ + unsigned int evntsel; + + watchdog_reset = p6_watchdog_reset; + + clear_msr_range(MSR_P6_EVNTSEL0, 2); + clear_msr_range(MSR_P6_PERFCTR0, 2); + + evntsel = P6_EVNTSEL_INT + | P6_EVNTSEL_OS + | P6_EVNTSEL_USR + | P6_NMI_EVENT; + + wrmsr(MSR_P6_EVNTSEL0, evntsel, 0); + Dprintk("setting P6_PERFCTR0 to %08lx\n", -(cpu_khz/nmi_hz*1000)); + wrmsr(MSR_P6_PERFCTR0, -(cpu_khz/nmi_hz*1000), 0); + apic_write(APIC_LVTPC, APIC_DM_NMI); + evntsel |= P6_EVNTSEL0_ENABLE; + wrmsr(MSR_P6_EVNTSEL0, evntsel, 0); +} + +static int p4_watchdog_reset(int handled) +{ + unsigned int low, high; + int source; + + rdmsr(MSR_P4_IQ_COUNTER0, low, high); + source = (low & (1 << 31)) == 0; + if (source) { + /* + * P4 quirks: + * - An overflown perfctr will assert its interrupt + * until the OVF flag in its CCCR is cleared. + * - LVTPC is masked on interrupt and must be + * unmasked by the LVTPC handler. + */ + wrmsr(MSR_P4_IQ_CCCR0, P4_NMI_IQ_CCCR0, 0); + apic_write(APIC_LVTPC, APIC_DM_NMI); + + wrmsr(MSR_P4_IQ_COUNTER0, -(cpu_khz/nmi_hz*1000), -1); + } + return source; +} + +static int __pminit setup_p4_watchdog(void) +{ + unsigned int misc_enable, dummy; + + rdmsr(MSR_P4_MISC_ENABLE, misc_enable, dummy); + if (!(misc_enable & MSR_P4_MISC_ENABLE_PERF_AVAIL)) + return 0; + + watchdog_reset = p4_watchdog_reset; + + if (!(misc_enable & MSR_P4_MISC_ENABLE_PEBS_UNAVAIL)) + clear_msr_range(0x3F1, 2); + /* MSR 0x3F0 seems to have a default value of 0xFC00, but current + docs doesn't fully define it, so leave it alone for now. 
*/ + clear_msr_range(0x3A0, 31); + clear_msr_range(0x3C0, 6); + clear_msr_range(0x3C8, 6); + clear_msr_range(0x3E0, 2); + clear_msr_range(MSR_P4_CCCR0, 18); + clear_msr_range(MSR_P4_PERFCTR0, 18); + + wrmsr(MSR_P4_CRU_ESCR0, P4_NMI_CRU_ESCR0, 0); + wrmsr(MSR_P4_IQ_CCCR0, P4_NMI_IQ_CCCR0 & ~P4_CCCR_ENABLE, 0); + Dprintk("setting P4_IQ_COUNTER0 to 0x%08lx\n", -(cpu_khz/nmi_hz*1000)); + wrmsr(MSR_P4_IQ_COUNTER0, -(cpu_khz/nmi_hz*1000), -1); + apic_write(APIC_LVTPC, APIC_DM_NMI); + wrmsr(MSR_P4_IQ_CCCR0, P4_NMI_IQ_CCCR0, 0); + return 1; +} + +void __pminit setup_apic_nmi_watchdog (void) +{ + switch (boot_cpu_data.x86_vendor) { + case X86_VENDOR_AMD: + if (boot_cpu_data.x86 != 6) + return; + setup_k7_watchdog(); + break; + case X86_VENDOR_INTEL: + switch (boot_cpu_data.x86) { + case 6: + setup_p6_watchdog(); + break; + case 15: + if (!setup_p4_watchdog()) + return; + break; + default: + return; + } + break; + default: + return; + } + nmi_pm_init(); +} + +static spinlock_t nmi_print_lock = SPIN_LOCK_UNLOCKED; + +/* + * the best way to detect whether a CPU has a 'hard lockup' problem + * is to check it's local APIC timer IRQ counts. If they are not + * changing then that CPU has some problem. + * + * as these watchdog NMI IRQs are generated on every CPU, we only + * have to check the current processor. + * + * since NMIs dont listen to _any_ locks, we have to be extremely + * careful not to rely on unsafe variables. The printk might lock + * up though, so we have to break up any console locks first ... + * [when there will be more tty-related locks, break them up + * here too!] + */ + +static unsigned int + last_irq_sums [NR_CPUS], + alert_counter [NR_CPUS]; + +void touch_nmi_watchdog (void) +{ + int i; + + /* + * Just reset the alert counters, (other CPUs might be + * spinning on locks we hold): + */ + for (i = 0; i < NR_CPUS; i++) + alert_counter[i] = 0; +} + +static int nmi_watchdog_tick (void * dev_id, struct pt_regs * regs, int cpu, + int handled) +{ + /* + * Since current_thread_info()-> is always on the stack, and we + * always switch the stack NMI-atomically, it's safe to use + * smp_processor_id(). + */ + int sum; + + if (! watchdog_reset(handled)) + return NOTIFY_DONE; /* We are not an NMI source. */ + + sum = irq_stat[cpu].apic_timer_irqs; + + if (last_irq_sums[cpu] == sum) { + /* + * Ayiee, looks like this CPU is stuck ... + * wait a few IRQs (5 seconds) before doing the oops ... + */ + alert_counter[cpu]++; + if (alert_counter[cpu] == 5*nmi_hz) { + spin_lock(&nmi_print_lock); + /* + * We are in trouble anyway, lets at least try + * to get a message out. 
+ */ + bust_spinlocks(1); + printk("NMI Watchdog detected LOCKUP on CPU%d, eip %08lx, registers:\n", cpu, regs->eip); + show_registers(regs); + printk("console shuts up ...\n"); + console_silent(); + spin_unlock(&nmi_print_lock); + bust_spinlocks(0); + do_exit(SIGSEGV); + } + } else { + last_irq_sums[cpu] = sum; + alert_counter[cpu] = 0; + } + + return NOTIFY_OK; +} diff -urN linux.orig/arch/i386/kernel/traps.c linux/arch/i386/kernel/traps.c --- linux.orig/arch/i386/kernel/traps.c Mon Oct 21 13:25:45 2002 +++ linux/arch/i386/kernel/traps.c Thu Oct 24 19:54:16 2002 @@ -40,7 +40,6 @@ #include <asm/debugreg.h> #include <asm/desc.h> #include <asm/i387.h> -#include <asm/nmi.h> #include <asm/smp.h> #include <asm/pgalloc.h> @@ -52,6 +51,7 @@ asmlinkage int system_call(void); asmlinkage void lcall7(void); asmlinkage void lcall27(void); +void init_nmi(void); struct desc_struct default_ldt[] = { { 0, 0 }, { 0, 0 }, { 0, 0 }, { 0, 0 }, { 0, 0 } }; @@ -443,107 +443,6 @@ } } -static void mem_parity_error(unsigned char reason, struct pt_regs * regs) -{ - printk("Uhhuh. NMI received. Dazed and confused, but trying to continue\n"); - printk("You probably have a hardware problem with your RAM chips\n"); - - /* Clear and disable the memory parity error line. */ - reason = (reason & 0xf) | 4; - outb(reason, 0x61); -} - -static void io_check_error(unsigned char reason, struct pt_regs * regs) -{ - unsigned long i; - - printk("NMI: IOCK error (debug interrupt?)\n"); - show_registers(regs); - - /* Re-enable the IOCK line, wait for a few seconds */ - reason = (reason & 0xf) | 8; - outb(reason, 0x61); - i = 2000; - while (--i) udelay(1000); - reason &= ~8; - outb(reason, 0x61); -} - -static void unknown_nmi_error(unsigned char reason, struct pt_regs * regs) -{ -#ifdef CONFIG_MCA - /* Might actually be able to figure out what the guilty party - * is. */ - if( MCA_bus ) { - mca_handle_nmi(); - return; - } -#endif - printk("Uhhuh. NMI received for unknown reason %02x on CPU %d.\n", - reason, smp_processor_id()); - printk("Dazed and confused, but trying to continue\n"); - printk("Do you have a strange power saving mode enabled?\n"); -} - -static void default_do_nmi(struct pt_regs * regs) -{ - unsigned char reason = inb(0x61); - - if (!(reason & 0xc0)) { -#if CONFIG_X86_LOCAL_APIC - /* - * Ok, so this is none of the documented NMI sources, - * so it must be the NMI watchdog. - */ - if (nmi_watchdog) { - nmi_watchdog_tick(regs); - return; - } -#endif - unknown_nmi_error(reason, regs); - return; - } - if (reason & 0x80) - mem_parity_error(reason, regs); - if (reason & 0x40) - io_check_error(reason, regs); - /* - * Reassert NMI in case it became active meanwhile - * as it's edge-triggered. - */ - outb(0x8f, 0x70); - inb(0x71); /* dummy */ - outb(0x0f, 0x70); - inb(0x71); /* dummy */ -} - -static int dummy_nmi_callback(struct pt_regs * regs, int cpu) -{ - return 0; -} - -static nmi_callback_t nmi_callback = dummy_nmi_callback; - -asmlinkage void do_nmi(struct pt_regs * regs, long error_code) -{ - int cpu = smp_processor_id(); - - ++nmi_count(cpu); - - if (!nmi_callback(regs, cpu)) - default_do_nmi(regs); -} - -void set_nmi_callback(nmi_callback_t callback) -{ - nmi_callback = callback; -} - -void unset_nmi_callback(void) -{ - nmi_callback = dummy_nmi_callback; -} - /* * Our handling of the processor debug registers is non-trivial. * We do not clear them on entry and exit from the kernel. 
Therefore @@ -924,4 +823,6 @@ cpu_init(); trap_init_hook(); + + init_nmi(); } diff -urN linux.orig/arch/i386/oprofile/nmi_int.c linux/arch/i386/oprofile/nmi_int.c --- linux.orig/arch/i386/oprofile/nmi_int.c Mon Oct 21 13:25:45 2002 +++ linux/arch/i386/oprofile/nmi_int.c Thu Oct 24 16:03:31 2002 @@ -54,12 +54,24 @@ // FIXME: kernel_only -static int nmi_callback(struct pt_regs * regs, int cpu) +static int nmi_callback(void *dev_id, struct pt_regs *regs, int cpu, int handled) { - return (model->check_ctrs(cpu, &cpu_msrs[cpu], regs)); + if (model->check_ctrs(cpu, &cpu_msrs[cpu], regs)) + return NOTIFY_OK; + + return NOTIFY_DONE; } - +static struct nmi_handler nmi_handler = +{ + .link = LIST_HEAD_INIT(nmi_handler.link), + .dev_name = "oprofile", + .dev_id = NULL, + .handler = nmi_callback, + .priority = 1023, /* Very high priority. */ +}; + + static void nmi_save_registers(struct op_msrs * msrs) { unsigned int const nr_ctrs = model->num_counters; @@ -96,8 +108,12 @@ } +static void nmi_cpu_shutdown(void * dummy); + static int nmi_setup(void) { + int rv; + /* We walk a thin line between law and rape here. * We need to be careful to install our NMI handler * without actually triggering any NMIs as this will */ smp_call_function(nmi_cpu_setup, NULL, 0, 1); nmi_cpu_setup(0); - set_nmi_callback(nmi_callback); + rv = request_nmi(&nmi_handler); + if (rv) { + smp_call_function(nmi_cpu_shutdown, NULL, 0, 1); + nmi_cpu_shutdown(0); + return rv; + } + oprofile_pmdev = set_nmi_pm_callback(oprofile_pm_callback); return 0; } @@ -145,7 +167,7 @@ static void nmi_shutdown(void) { unset_nmi_pm_callback(oprofile_pmdev); - unset_nmi_callback(); + release_nmi(&nmi_handler); smp_call_function(nmi_cpu_shutdown, NULL, 0, 1); nmi_cpu_shutdown(0); } diff -urN linux.orig/drivers/char/Config.help linux/drivers/char/Config.help --- linux.orig/drivers/char/Config.help Mon Oct 21 13:26:00 2002 +++ linux/drivers/char/Config.help Wed Oct 30 10:09:07 2002 @@ -946,6 +946,26 @@ If compiled as a module, it will be called scx200_gpio.o. +CONFIG_IPMI_HANDLER + This enables the central IPMI message handler, required for IPMI + to work. Note that you must have this enabled to do any other IPMI + things. See IPMI.txt for more details. + +CONFIG_IPMI_PANIC_EVENT + When a panic occurs, this will cause the IPMI message handler to + generate an IPMI event describing the panic to each interface + registered with the message handler. + +CONFIG_IPMI_DEVICE_INTERFACE + This provides an IOCTL interface to the IPMI message handler so + userland processes may use IPMI. It supports poll() and select(). + +CONFIG_IPMI_KCS + Provides a driver for a KCS-style interface to a BMC. + +CONFIG_IPMI_WATCHDOG + This enables the IPMI watchdog timer. + Texas Instruments parallel link cable support CONFIG_TIPAR If you own a Texas Instruments graphing calculator and use a @@ -966,4 +986,4 @@ Instruments graphing calculator is, then you probably don't need this driver. - If unsure, say N. \ No newline at end of file + If unsure, say N. 
diff -urN linux.orig/drivers/char/Config.in linux/drivers/char/Config.in --- linux.orig/drivers/char/Config.in Mon Oct 21 13:25:47 2002 +++ linux/drivers/char/Config.in Wed Oct 30 10:07:53 2002 @@ -105,6 +105,12 @@ fi fi +tristate 'IPMI top-level message handler' CONFIG_IPMI_HANDLER +dep_mbool ' Generate a panic event to all BMCs on a panic' CONFIG_IPMI_PANIC_EVENT $CONFIG_IPMI_HANDLER +dep_tristate ' Device interface for IPMI' CONFIG_IPMI_DEVICE_INTERFACE $CONFIG_IPMI_HANDLER +dep_tristate ' IPMI KCS handler' CONFIG_IPMI_KCS $CONFIG_IPMI_HANDLER +dep_tristate ' IPMI Watchdog Timer' CONFIG_IPMI_WATCHDOG $CONFIG_IPMI_HANDLER + mainmenu_option next_comment comment 'Watchdog Cards' bool 'Watchdog Timer Support' CONFIG_WATCHDOG diff -urN linux.orig/drivers/char/Makefile linux/drivers/char/Makefile --- linux.orig/drivers/char/Makefile Mon Oct 21 13:26:00 2002 +++ linux/drivers/char/Makefile Mon Oct 21 13:27:57 2002 @@ -103,6 +103,7 @@ obj-$(CONFIG_AGP) += agp/ obj-$(CONFIG_DRM) += drm/ obj-$(CONFIG_PCMCIA) += pcmcia/ +obj-$(CONFIG_IPMI_HANDLER) += ipmi/ # Files generated that shall be removed upon make clean clean-files := consolemap_deftbl.c defkeymap.c qtronixmap.c diff -urN linux.orig/drivers/char/ipmi/Makefile linux/drivers/char/ipmi/Makefile --- linux.orig/drivers/char/ipmi/Makefile Wed Dec 31 18:00:00 1969 +++ linux/drivers/char/ipmi/Makefile Sun Oct 13 16:46:24 2002 @@ -0,0 +1,17 @@ +# +# Makefile for the ipmi drivers. +# + +export-objs := ipmi_msghandler.o ipmi_watchdog.o + +ipmi_kcs_drv-objs := ipmi_kcs_sm.o ipmi_kcs_intf.o + +obj-$(CONFIG_IPMI_HANDLER) += ipmi_msghandler.o +obj-$(CONFIG_IPMI_DEVICE_INTERFACE) += ipmi_devintf.o +obj-$(CONFIG_IPMI_KCS) += ipmi_kcs_drv.o +obj-$(CONFIG_IPMI_WATCHDOG) += ipmi_watchdog.o + +include $(TOPDIR)/Rules.make + +ipmi_kcs_drv.o: $(ipmi_kcs_drv-objs) + $(LD) -r -o $@ $(ipmi_kcs_drv-objs) diff -urN linux.orig/drivers/char/ipmi/ipmi_devintf.c linux/drivers/char/ipmi/ipmi_devintf.c --- linux.orig/drivers/char/ipmi/ipmi_devintf.c Wed Dec 31 18:00:00 1969 +++ linux/drivers/char/ipmi/ipmi_devintf.c Wed Oct 30 13:51:55 2002 @@ -0,0 +1,539 @@ +/* + * ipmi_devintf.c + * + * Linux device interface for the IPMI message handler. + * + * Author: MontaVista Software, Inc. + * Corey Minyard <minyard@mvista.com> + * source@mvista.com + * + * Copyright 2002 MontaVista Software Inc. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or (at your + * option) any later version. + * + * + * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED + * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, + * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS + * OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND + * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR + * TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE + * USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + * + * You should have received a copy of the GNU General Public License along + * with this program; if not, write to the Free Software Foundation, Inc., + * 675 Mass Ave, Cambridge, MA 02139, USA. 
+ */ + +#include <linux/config.h> +#include <linux/module.h> +#include <linux/errno.h> +#include <asm/system.h> +#include <linux/sched.h> +#include <linux/poll.h> +#include <linux/spinlock.h> +#include <linux/slab.h> +#include <linux/devfs_fs_kernel.h> +#include <linux/ipmi.h> +#include <asm/semaphore.h> +#include <linux/init.h> + +struct ipmi_file_private +{ + ipmi_user_t user; + spinlock_t recv_msg_lock; + struct list_head recv_msgs; + struct file *file; + struct fasync_struct *fasync_queue; + wait_queue_head_t wait; + struct semaphore recv_sem; +}; + +static void file_receive_handler(struct ipmi_recv_msg *msg, + void *handler_data) +{ + struct ipmi_file_private *priv = handler_data; + int was_empty; + unsigned long flags; + + spin_lock_irqsave(&(priv->recv_msg_lock), flags); + + was_empty = list_empty(&(priv->recv_msgs)); + list_add_tail(&(msg->link), &(priv->recv_msgs)); + + if (was_empty) { + wake_up_interruptible(&priv->wait); + kill_fasync(&priv->fasync_queue, SIGIO, POLL_IN); + } + + spin_unlock_irqrestore(&(priv->recv_msg_lock), flags); +} + +static unsigned int ipmi_poll(struct file *file, poll_table *wait) +{ + struct ipmi_file_private *priv = file->private_data; + unsigned int mask = 0; + unsigned long flags; + + spin_lock_irqsave(&priv->recv_msg_lock, flags); + + poll_wait(file, &priv->wait, wait); + + if (! list_empty(&(priv->recv_msgs))) + mask |= (POLLIN | POLLRDNORM); + + spin_unlock_irqrestore(&priv->recv_msg_lock, flags); + + return mask; +} + +static int ipmi_fasync(int fd, struct file *file, int on) +{ + struct ipmi_file_private *priv = file->private_data; + int result; + + result = fasync_helper(fd, file, on, &priv->fasync_queue); + + return (result); +} + +static struct ipmi_user_hndl ipmi_hndlrs = +{ + ipmi_recv_hndl : file_receive_handler +}; + +static int ipmi_open(struct inode *inode, struct file *file) +{ + int if_num = minor(inode->i_rdev); + int rv; + struct ipmi_file_private *priv; + + + MOD_INC_USE_COUNT; + + priv = kmalloc(sizeof(*priv), GFP_KERNEL); + if (!priv) { + MOD_DEC_USE_COUNT; + return -ENOMEM; + } + + priv->file = file; + + rv = ipmi_create_user(if_num, + &ipmi_hndlrs, + priv, + &(priv->user)); + if (rv) { + kfree(priv); + MOD_DEC_USE_COUNT; + return rv; + } + + file->private_data = priv; + + spin_lock_init(&(priv->recv_msg_lock)); + INIT_LIST_HEAD(&(priv->recv_msgs)); + init_waitqueue_head(&priv->wait); + priv->fasync_queue = NULL; + sema_init(&(priv->recv_sem), 1); + + return 0; +} + +static int ipmi_release(struct inode *inode, struct file *file) +{ + struct ipmi_file_private *priv = file->private_data; + int rv; + + rv = ipmi_destroy_user(priv->user); + if (rv) + return rv; + + ipmi_fasync (-1, file, 0); + + /* FIXME - free the messages in the list. 
*/ + kfree(priv); + + MOD_DEC_USE_COUNT; + + return 0; +} + +static int ipmi_ioctl(struct inode *inode, + struct file *file, + unsigned int cmd, + unsigned long data) +{ + int rv = -EINVAL; + struct ipmi_file_private *priv = file->private_data; + + switch (cmd) + { + case IPMICTL_SEND_COMMAND: + { + struct ipmi_req req; + struct ipmi_addr addr; + unsigned char msgdata[IPMI_MAX_MSG_LENGTH]; + + if (copy_from_user(&req, (void *) data, sizeof(req))) { + rv = -EFAULT; + break; + } + + if (req.addr_len > sizeof(struct ipmi_addr)) + { + rv = -EINVAL; + break; + } + + if (copy_from_user(&addr, req.addr, req.addr_len)) { + rv = -EFAULT; + break; + } + + rv = ipmi_validate_addr(&addr, req.addr_len); + if (rv) + break; + + if (req.msg.data != NULL) { + if (req.msg.data_len > IPMI_MAX_MSG_LENGTH) { + rv = -EMSGSIZE; + break; + } + + if (copy_from_user(&msgdata, + req.msg.data, + req.msg.data_len)) + { + rv = -EFAULT; + break; + } + } else { + req.msg.data_len = 0; + } + + req.msg.data = msgdata; + + rv = ipmi_request(priv->user, + &addr, + req.msgid, + &(req.msg), + 0); + break; + } + + case IPMICTL_RECEIVE_MSG: + case IPMICTL_RECEIVE_MSG_TRUNC: + { + struct ipmi_recv rsp; + int addr_len; + struct list_head *entry; + struct ipmi_recv_msg *msg; + unsigned long flags; + + + rv = 0; + if (copy_from_user(&rsp, (void *) data, sizeof(rsp))) { + rv = -EFAULT; + break; + } + + /* We claim a semaphore because we don't want two + users getting something from the queue at a time. + Since we have to release the spinlock before we can + copy the data to the user, it's possible another + user will grab something from the queue, too. Then + the messages might get out of order if something + fails and the message gets put back onto the + queue. This semaphore prevents that problem. */ + down(&(priv->recv_sem)); + + /* Grab the message off the list. */ + spin_lock_irqsave(&(priv->recv_msg_lock), flags); + if (list_empty(&(priv->recv_msgs))) { + spin_unlock_irqrestore(&(priv->recv_msg_lock), flags); + rv = -EAGAIN; + goto recv_err; + } + entry = priv->recv_msgs.next; + msg = list_entry(entry, struct ipmi_recv_msg, link); + list_del(entry); + spin_unlock_irqrestore(&(priv->recv_msg_lock), flags); + + addr_len = ipmi_addr_length(msg->addr.addr_type); + if (rsp.addr_len < addr_len) + { + rv = -EINVAL; + goto recv_putback_on_err; + } + + if (copy_to_user(rsp.addr, &(msg->addr), addr_len)) { + rv = -EFAULT; + goto recv_putback_on_err; + } + rsp.addr_len = addr_len; + + rsp.recv_type = msg->recv_type; + rsp.msgid = msg->msgid; + rsp.msg.netfn = msg->msg.netfn; + rsp.msg.cmd = msg->msg.cmd; + + if (msg->msg.data_len > 0) { + if (rsp.msg.data_len < msg->msg.data_len) { + rv = -EMSGSIZE; + if (cmd == IPMICTL_RECEIVE_MSG_TRUNC) { + msg->msg.data_len = rsp.msg.data_len; + } else { + goto recv_putback_on_err; + } + } + + if (copy_to_user(rsp.msg.data, + msg->msg.data, + msg->msg.data_len)) + { + rv = -EFAULT; + goto recv_putback_on_err; + } + rsp.msg.data_len = msg->msg.data_len; + } else { + rsp.msg.data_len = 0; + } + + if (copy_to_user((void *) data, &rsp, sizeof(rsp))) { + rv = -EFAULT; + goto recv_putback_on_err; + } + + up(&(priv->recv_sem)); + ipmi_free_recv_msg(msg); + break; + + recv_putback_on_err: + /* If we got an error, put the message back onto + the head of the queue. 
 */
+        spin_lock_irqsave(&(priv->recv_msg_lock), flags);
+        list_add(entry, &(priv->recv_msgs));
+        spin_unlock_irqrestore(&(priv->recv_msg_lock), flags);
+        up(&(priv->recv_sem));
+        break;
+
+    recv_err:
+        up(&(priv->recv_sem));
+        break;
+    }
+
+    case IPMICTL_REGISTER_FOR_CMD:
+    {
+        struct ipmi_cmdspec val;
+
+        if (copy_from_user(&val, (void *) data, sizeof(val))) {
+            rv = -EFAULT;
+            break;
+        }
+
+        rv = ipmi_register_for_cmd(priv->user, val.netfn, val.cmd);
+        break;
+    }
+
+    case IPMICTL_UNREGISTER_FOR_CMD:
+    {
+        struct ipmi_cmdspec val;
+
+        if (copy_from_user(&val, (void *) data, sizeof(val))) {
+            rv = -EFAULT;
+            break;
+        }
+
+        rv = ipmi_unregister_for_cmd(priv->user, val.netfn, val.cmd);
+        break;
+    }
+
+    case IPMICTL_SET_GETS_EVENTS_CMD:
+    {
+        int val;
+
+        if (copy_from_user(&val, (void *) data, sizeof(val))) {
+            rv = -EFAULT;
+            break;
+        }
+
+        rv = ipmi_set_gets_events(priv->user, val);
+        break;
+    }
+
+    case IPMICTL_SET_MY_ADDRESS_CMD:
+    {
+        unsigned int val;
+
+        if (copy_from_user(&val, (void *) data, sizeof(val))) {
+            rv = -EFAULT;
+            break;
+        }
+
+        ipmi_set_my_address(priv->user, val);
+        rv = 0;
+        break;
+    }
+
+    case IPMICTL_GET_MY_ADDRESS_CMD:
+    {
+        unsigned int val;
+
+        val = ipmi_get_my_address(priv->user);
+
+        if (copy_to_user((void *) data, &val, sizeof(val))) {
+            rv = -EFAULT;
+            break;
+        }
+        rv = 0;
+        break;
+    }
+
+    case IPMICTL_SET_MY_LUN_CMD:
+    {
+        unsigned int val;
+
+        if (copy_from_user(&val, (void *) data, sizeof(val))) {
+            rv = -EFAULT;
+            break;
+        }
+
+        ipmi_set_my_LUN(priv->user, val);
+        rv = 0;
+        break;
+    }
+
+    case IPMICTL_GET_MY_LUN_CMD:
+    {
+        unsigned int val;
+
+        val = ipmi_get_my_LUN(priv->user);
+
+        if (copy_to_user((void *) data, &val, sizeof(val))) {
+            rv = -EFAULT;
+            break;
+        }
+        rv = 0;
+        break;
+    }
+
+    }
+
+    return rv;
+}
+
+
+static struct file_operations ipmi_fops = {
+    owner:   THIS_MODULE,
+    ioctl:   ipmi_ioctl,
+    open:    ipmi_open,
+    release: ipmi_release,
+    fasync:  ipmi_fasync,
+    poll:    ipmi_poll
+};
+
+#define DEVICE_NAME "ipmidev"
+
+static int ipmi_major = 0;
+MODULE_PARM(ipmi_major, "i");
+
+static devfs_handle_t devfs_handle;
+
+#define MAX_DEVICES 10
+static devfs_handle_t handles[MAX_DEVICES];
+
+static void ipmi_new_smi(int if_num)
+{
+    char name[2];
+
+    if (if_num >= MAX_DEVICES)
+        return;
+
+    name[0] = if_num + '0';
+    name[1] = '\0';
+
+    handles[if_num] = devfs_register(devfs_handle, name, DEVFS_FL_NONE,
+                                     ipmi_major, if_num,
+                                     S_IFCHR | S_IRUSR | S_IWUSR,
+                                     &ipmi_fops, NULL);
+}
+
+static void ipmi_smi_gone(int if_num)
+{
+    if (if_num >= MAX_DEVICES)
+        return;
+
+    devfs_unregister(handles[if_num]);
+}
+
+static struct ipmi_smi_watcher smi_watcher =
+{
+    new_smi  : ipmi_new_smi,
+    smi_gone : ipmi_smi_gone
+};
+
+static __init int init_ipmi_devintf(void)
+{
+    int rv;
+
+    if (ipmi_major < 0)
+        return -EINVAL;
+
+    rv = register_chrdev(ipmi_major, DEVICE_NAME, &ipmi_fops);
+    if (rv < 0) {
+        printk(KERN_ERR "ipmi: can't get major %d\n", ipmi_major);
+        return rv;
+    }
+
+    if (ipmi_major == 0) {
+        ipmi_major = rv;
+    }
+
+    rv = ipmi_smi_watcher_register(&smi_watcher);
+    if (rv) {
+        unregister_chrdev(ipmi_major, DEVICE_NAME);
+        printk(KERN_WARNING "ipmi: can't register smi watcher\n");
+        return rv;
+    }
+
+    devfs_handle = devfs_mk_dir(NULL, DEVICE_NAME, NULL);
+
+    printk(KERN_INFO "ipmi: device interface at char major %d\n",
+           ipmi_major);
+
+    return 0;
+}
+module_init(init_ipmi_devintf);
+
+static __exit void cleanup_ipmi(void)
+{
+    ipmi_smi_watcher_unregister(&smi_watcher);
+    devfs_unregister(devfs_handle);
+    unregister_chrdev(ipmi_major, DEVICE_NAME);
+}
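+
+/* Note the teardown order above: the SMI watcher goes first so no new
+   devfs nodes can appear mid-teardown, then the devfs directory, then
+   the character device itself.  Open file descriptors keep their
+   ipmi_file_private until release time, so existing users are not
+   broken by the unregister itself. */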
+module_exit(cleanup_ipmi); +#ifndef MODULE +static __init int ipmi_setup (char *str) +{ + int x; + + if (get_option (&str, &x)) { + /* ipmi=x sets the major number to x. */ + ipmi_major = x; + } else if (!strcmp(str, "off")) { + ipmi_major = -1; + } + + return 1; +} +#endif + +__setup("ipmi=", ipmi_setup); +MODULE_LICENSE("GPL"); diff -urN linux.orig/drivers/char/ipmi/ipmi_kcs_intf.c linux/drivers/char/ipmi/ipmi_kcs_intf.c --- linux.orig/drivers/char/ipmi/ipmi_kcs_intf.c Wed Dec 31 18:00:00 1969 +++ linux/drivers/char/ipmi/ipmi_kcs_intf.c Wed Oct 30 13:51:55 2002 @@ -0,0 +1,991 @@ +/* + * ipmi_kcs_intf.c + * + * The interface to the IPMI driver for the KCS. + * + * Author: MontaVista Software, Inc. + * Corey Minyard <minyard@mvista.com> + * source@mvista.com + * + * Copyright 2002 MontaVista Software Inc. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or (at your + * option) any later version. + * + * + * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED + * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, + * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS + * OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND + * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR + * TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE + * USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + * + * You should have received a copy of the GNU General Public License along + * with this program; if not, write to the Free Software Foundation, Inc., + * 675 Mass Ave, Cambridge, MA 02139, USA. + */ + +/* + * This file holds the "policy" for the interface to the KCS state + * machine. It does the configuration, handles timers and interrupts, + * and drives the real KCS state machine. + */ + +#include <linux/config.h> +#include <linux/module.h> +#include <asm/system.h> +#include <linux/sched.h> +#include <linux/timer.h> +#include <linux/errno.h> +#include <linux/spinlock.h> +#include <linux/slab.h> +#include <linux/delay.h> +#include <linux/list.h> +#include <linux/ioport.h> +#ifdef CONFIG_HIGH_RES_TIMERS +#include <linux/hrtime.h> +#endif +#include <linux/ipmi_smi.h> +#include <asm/irq.h> +#include <asm/io.h> +#include "ipmi_kcs_sm.h" +#include <linux/init.h> + +/* Measure times between events in the driver. */ +#undef DEBUG_TIMING + +#ifdef CONFIG_IPMI_KCS +/* This forces a dependency to the config file for this option. */ +#endif + +enum kcs_intf_state { + KCS_NORMAL, + KCS_GETTING_FLAGS, + KCS_GETTING_EVENTS, + KCS_CLEARING_FLAGS, + KCS_CLEARING_FLAGS_THEN_SET_IRQ, + KCS_GETTING_MESSAGES, + KCS_ENABLE_INTERRUPTS1, + KCS_ENABLE_INTERRUPTS2 + /* FIXME - add watchdog stuff. */ +}; + +struct kcs_info +{ + ipmi_smi_t intf; + struct kcs_data *kcs_sm; + spinlock_t kcs_lock; + struct list_head xmit_msgs; + struct list_head hp_xmit_msgs; + struct ipmi_smi_msg *curr_msg; + enum kcs_intf_state kcs_state; + + /* Flags from the last GET_MSG_FLAGS command, used when an ATTN + is set to hold the flags until we are done handling everything + from the flags. 
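+	   The bit values defined below match the flag bits in the
+	   response to the IPMI Get Message Flags command, which is why
+	   the response byte can be stored here directly (see the
+	   KCS_GETTING_FLAGS case in handle_transaction_done(), where
+	   msg_flags is set from msg[3]).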
 */
+#define RECEIVE_MSG_AVAIL	0x01
+#define EVENT_MSG_BUFFER_FULL	0x02
+#define WDT_PRE_TIMEOUT_INT	0x08
+    unsigned char       msg_flags;
+
+    /* If set to true, this will request events the next time the
+       state machine is idle. */
+    int                 req_events;
+
+    /* If true, run the state machine to completion on every send
+       call.  Generally used after a panic to make sure stuff goes
+       out. */
+    int                 run_to_completion;
+
+    /* The I/O address of the KCS interface. */
+    int                 addr;
+
+    /* Zero if no irq. */
+    int                 irq;
+
+    /* The timer for this kcs. */
+    struct timer_list   kcs_timer;
+
+    /* The time (in jiffies) the last timeout occurred at. */
+    unsigned long       last_timeout_jiffies;
+
+    /* Used to gracefully stop the timer without race conditions. */
+    volatile int        stop_operation;
+    volatile int        timer_stopped;
+
+    /* The driver will disable interrupts when it gets into a
+       situation where it cannot handle messages due to lack of
+       memory.  Once that situation clears up, it will re-enable
+       interrupts. */
+    int                 interrupt_disabled;
+};
+
+static void deliver_recv_msg(struct kcs_info *kcs_info, struct ipmi_smi_msg *msg)
+{
+    /* Deliver the message to the upper layer with the lock
+       released. */
+    spin_unlock(&(kcs_info->kcs_lock));
+    ipmi_smi_msg_received(kcs_info->intf, msg);
+    spin_lock(&(kcs_info->kcs_lock));
+}
+
+static void return_hosed_msg(struct kcs_info *kcs_info)
+{
+    struct ipmi_smi_msg *msg = kcs_info->curr_msg;
+
+    /* Make it a response. */
+    msg->rsp[0] = msg->data[0] | 4;
+    msg->rsp[1] = msg->data[1];
+    msg->rsp[2] = 0xff; /* Unknown error. */
+    msg->rsp_size = 3;
+
+    deliver_recv_msg(kcs_info, msg);
+    kcs_info->curr_msg = NULL;
+}
+
+static enum kcs_result start_next_msg(struct kcs_info *kcs_info)
+{
+    int              rv;
+    struct list_head *entry = NULL;
+#ifdef DEBUG_TIMING
+    struct timeval t;
+#endif
+
+    /* Pick the high priority queue first. */
+    if (! list_empty(&(kcs_info->hp_xmit_msgs))) {
+        entry = kcs_info->hp_xmit_msgs.next;
+    } else if (! list_empty(&(kcs_info->xmit_msgs))) {
+        entry = kcs_info->xmit_msgs.next;
+    }
+
+    if (!entry) {
+        kcs_info->curr_msg = NULL;
+        return KCS_SM_IDLE;
+    } else {
+        list_del(entry);
+        kcs_info->curr_msg = list_entry(entry,
+                                        struct ipmi_smi_msg,
+                                        link);
+#ifdef DEBUG_TIMING
+        do_gettimeofday(&t);
+        printk("**Start2: %d.%9.9d\n", t.tv_sec, t.tv_usec);
+#endif
+        rv = start_kcs_transaction(kcs_info->kcs_sm,
+                                   kcs_info->curr_msg->data,
+                                   kcs_info->curr_msg->data_size);
+        if (rv) {
+            return_hosed_msg(kcs_info);
+        }
+
+        return KCS_CALL_WITHOUT_DELAY;
+    }
+}
+
+static void start_enable_irq(struct kcs_info *kcs_info)
+{
+    unsigned char msg[2];
+
+    /* If we are enabling interrupts, we have to tell the
+       BMC to use them. */
+    msg[0] = (IPMI_NETFN_APP_REQUEST << 2);
+    msg[1] = IPMI_GET_BMC_GLOBAL_ENABLES_CMD;
+
+    start_kcs_transaction(kcs_info->kcs_sm, msg, 2);
+    kcs_info->kcs_state = KCS_ENABLE_INTERRUPTS1;
+}
+
+static void start_clear_flags(struct kcs_info *kcs_info)
+{
+    unsigned char msg[3];
+
+    /* Make sure the watchdog pre-timeout flag is not set at startup. */
+    msg[0] = (IPMI_NETFN_APP_REQUEST << 2);
+    msg[1] = IPMI_CLEAR_MSG_FLAGS_CMD;
+    msg[2] = WDT_PRE_TIMEOUT_INT;
+
+    start_kcs_transaction(kcs_info->kcs_sm, msg, 3);
+    kcs_info->kcs_state = KCS_CLEARING_FLAGS;
+}
+
+/* When we have a situation where we run out of memory and cannot
+   allocate messages, we just leave them in the BMC and run the system
+   polled until we can allocate some memory.  Once we have some
+   memory, we will re-enable the interrupt. 
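+   The _nosync variant is used in disable_kcs_irq() below because this
+   code can run from the KCS interrupt handler itself; a plain
+   disable_irq() waits for any executing handler to finish and would
+   deadlock there.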
*/ +static inline void disable_kcs_irq(struct kcs_info *kcs_info) +{ + if ((kcs_info->irq) && (!kcs_info->interrupt_disabled)) { + disable_irq_nosync(kcs_info->irq); + kcs_info->interrupt_disabled = 1; + } +} + +static inline void enable_kcs_irq(struct kcs_info *kcs_info) +{ + if ((kcs_info->irq) && (kcs_info->interrupt_disabled)) { + enable_irq(kcs_info->irq); + kcs_info->interrupt_disabled = 0; + } +} + +static void handle_flags(struct kcs_info *kcs_info) +{ + if (kcs_info->msg_flags & WDT_PRE_TIMEOUT_INT) { + /* Watchdog pre-timeout */ + start_clear_flags(kcs_info); + spin_unlock(&(kcs_info->kcs_lock)); + ipmi_smi_watchdog_pretimeout(kcs_info->intf); + spin_lock(&(kcs_info->kcs_lock)); + } else if (kcs_info->msg_flags & RECEIVE_MSG_AVAIL) { + /* Messages available. */ + kcs_info->curr_msg = ipmi_alloc_smi_msg(); + if (!kcs_info->curr_msg) { + disable_kcs_irq(kcs_info); + kcs_info->kcs_state = KCS_NORMAL; + return; + } + enable_kcs_irq(kcs_info); + + kcs_info->curr_msg->data[0] = (IPMI_NETFN_APP_REQUEST << 2); + kcs_info->curr_msg->data[1] = IPMI_GET_MSG_CMD; + kcs_info->curr_msg->data_size = 2; + + start_kcs_transaction(kcs_info->kcs_sm, + kcs_info->curr_msg->data, + kcs_info->curr_msg->data_size); + kcs_info->kcs_state = KCS_GETTING_MESSAGES; + } else if (kcs_info->msg_flags & EVENT_MSG_BUFFER_FULL) { + /* Events available. */ + kcs_info->curr_msg = ipmi_alloc_smi_msg(); + if (!kcs_info->curr_msg) { + disable_kcs_irq(kcs_info); + kcs_info->kcs_state = KCS_NORMAL; + return; + } + enable_kcs_irq(kcs_info); + + kcs_info->curr_msg->data[0] = (IPMI_NETFN_APP_REQUEST << 2); + kcs_info->curr_msg->data[1] = IPMI_READ_EVENT_MSG_BUFFER_CMD; + kcs_info->curr_msg->data_size = 2; + + start_kcs_transaction(kcs_info->kcs_sm, + kcs_info->curr_msg->data, + kcs_info->curr_msg->data_size); + kcs_info->kcs_state = KCS_GETTING_EVENTS; + } else { + kcs_info->kcs_state = KCS_NORMAL; + } +} + +static void handle_transaction_done(struct kcs_info *kcs_info) +{ +#ifdef DEBUG_TIMING + struct timeval t; + + do_gettimeofday(&t); + printk("**Done: %d.%9.9d\n", t.tv_sec, t.tv_usec); +#endif + switch (kcs_info->kcs_state) { + case KCS_NORMAL: + kcs_info->curr_msg->rsp_size + = kcs_get_result(kcs_info->kcs_sm, + kcs_info->curr_msg->rsp, + IPMI_MAX_MSG_LENGTH); + + deliver_recv_msg(kcs_info, kcs_info->curr_msg); + kcs_info->curr_msg = NULL; + break; + + case KCS_GETTING_FLAGS: + { + unsigned char msg[4]; + + /* We got the flags from the KCS, now handle them. */ + kcs_get_result(kcs_info->kcs_sm, msg, 4); + if (msg[2] != 0) { + /* Error fetching flags, just give up for + now. */ + kcs_info->kcs_state = KCS_NORMAL; + } else { + kcs_info->msg_flags = msg[3]; + handle_flags(kcs_info); + } + break; + } + + case KCS_CLEARING_FLAGS: + case KCS_CLEARING_FLAGS_THEN_SET_IRQ: + { + unsigned char msg[3]; + + /* We cleared the flags. */ + kcs_get_result(kcs_info->kcs_sm, msg, 3); + if (msg[2] != 0) { + /* Error clearing flags */ + printk("ipmi_kcs_intf: Error clearing flags: %2.2x\n", + msg[2]); + } + if (kcs_info->kcs_state == KCS_CLEARING_FLAGS_THEN_SET_IRQ) + start_enable_irq(kcs_info); + else + kcs_info->kcs_state = KCS_NORMAL; + break; + } + + case KCS_GETTING_EVENTS: + { + kcs_info->curr_msg->rsp_size + = kcs_get_result(kcs_info->kcs_sm, + kcs_info->curr_msg->rsp, + IPMI_MAX_MSG_LENGTH); + + if (kcs_info->curr_msg->rsp[2] != 0) { + /* Error getting event, probably done. */ + kcs_info->curr_msg->done(kcs_info->curr_msg); + + /* Take off the event flag. 
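+			   Clearing the bit in the cached msg_flags
+			   matters because handle_flags() is called
+			   again just below; without it the driver
+			   would keep retrying the exhausted event
+			   buffer instead of moving on to the other
+			   flag bits.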
 */
+        kcs_info->msg_flags &= ~EVENT_MSG_BUFFER_FULL;
+        } else {
+            deliver_recv_msg(kcs_info, kcs_info->curr_msg);
+        }
+        kcs_info->curr_msg = NULL;
+        handle_flags(kcs_info);
+        break;
+    }
+
+    case KCS_GETTING_MESSAGES:
+    {
+        kcs_info->curr_msg->rsp_size
+            = kcs_get_result(kcs_info->kcs_sm,
+                             kcs_info->curr_msg->rsp,
+                             IPMI_MAX_MSG_LENGTH);
+
+        if (kcs_info->curr_msg->rsp[2] != 0) {
+            /* Error getting the message, probably done. */
+            kcs_info->curr_msg->done(kcs_info->curr_msg);
+
+            /* Take off the msg flag. */
+            kcs_info->msg_flags &= ~RECEIVE_MSG_AVAIL;
+        } else {
+            deliver_recv_msg(kcs_info, kcs_info->curr_msg);
+        }
+        kcs_info->curr_msg = NULL;
+        handle_flags(kcs_info);
+        break;
+    }
+
+    case KCS_ENABLE_INTERRUPTS1:
+    {
+        unsigned char msg[4];
+
+        /* We got the flags from the KCS, now handle them. */
+        kcs_get_result(kcs_info->kcs_sm, msg, 4);
+        if (msg[2] != 0) {
+            printk(KERN_WARNING
+                   "ipmi_kcs: Could not enable interrupts"
+                   ", failed get, using polled mode.\n");
+            kcs_info->kcs_state = KCS_NORMAL;
+        } else {
+            msg[0] = (IPMI_NETFN_APP_REQUEST << 2);
+            msg[1] = IPMI_SET_BMC_GLOBAL_ENABLES_CMD;
+            msg[2] = msg[3] | 1; /* enable msg queue int */
+            start_kcs_transaction(kcs_info->kcs_sm, msg, 3);
+            kcs_info->kcs_state = KCS_ENABLE_INTERRUPTS2;
+        }
+        break;
+    }
+
+    case KCS_ENABLE_INTERRUPTS2:
+    {
+        unsigned char msg[4];
+
+        /* We got the flags from the KCS, now handle them. */
+        kcs_get_result(kcs_info->kcs_sm, msg, 4);
+        if (msg[2] != 0) {
+            printk(KERN_WARNING
+                   "ipmi_kcs: Could not enable interrupts"
+                   ", failed set, using polled mode.\n");
+        }
+        kcs_info->kcs_state = KCS_NORMAL;
+        break;
+    }
+    }
+}
+
+/* Called on timeouts and events.  Timeouts should pass the elapsed
+   time, interrupts should pass in zero. */
+static enum kcs_result kcs_event_handler(struct kcs_info *kcs_info, int time)
+{
+    enum kcs_result kcs_result;
+
+ restart:
+    /* There used to be a loop here that waited a little while
+       (around 25us) before giving up.  That turned out to be
+       pointless, the minimum delays I was seeing were in the 300us
+       range, which is far too long to wait in an interrupt.  So
+       we just run until the state machine tells us something
+       happened or it needs a delay. */
+    kcs_result = kcs_event(kcs_info->kcs_sm, time);
+    time = 0;
+    while (kcs_result == KCS_CALL_WITHOUT_DELAY)
+    {
+        kcs_result = kcs_event(kcs_info->kcs_sm, 0);
+    }
+
+    if (kcs_result == KCS_TRANSACTION_COMPLETE)
+    {
+        handle_transaction_done(kcs_info);
+        kcs_result = kcs_event(kcs_info->kcs_sm, 0);
+    }
+    else if (kcs_result == KCS_SM_HOSED)
+    {
+        if (kcs_info->curr_msg != NULL) {
+            /* If we were handling a user message, format
+               a response to send to the upper layer to
+               tell it about the error. */
+            return_hosed_msg(kcs_info);
+        }
+        kcs_result = kcs_event(kcs_info->kcs_sm, 0);
+        kcs_info->kcs_state = KCS_NORMAL;
+    }
+
+    /* We prefer handling attn over new messages. */
+    if (kcs_result == KCS_ATTN)
+    {
+        unsigned char msg[2];
+
+        /* Got an attn, send down a get message flags to see
+           what's causing it.  It would be better to handle
+           this in the upper layer, but due to the way
+           interrupts work with the KCS, that's not really
+           possible. */
+        msg[0] = (IPMI_NETFN_APP_REQUEST << 2);
+        msg[1] = IPMI_GET_MSG_FLAGS_CMD;
+
+        start_kcs_transaction(kcs_info->kcs_sm, msg, 2);
+        kcs_info->kcs_state = KCS_GETTING_FLAGS;
+        goto restart;
+    }
+
+    /* If we are currently idle, try to start the next message. 
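+       Every action above ends by jumping back to restart, so a single
+       call to this function keeps the state machine busy until it is
+       genuinely idle: finish the transaction, service any ATTN, then
+       start whatever is queued.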
*/ + if (kcs_result == KCS_SM_IDLE) { + kcs_result = start_next_msg(kcs_info); + if (kcs_result != KCS_SM_IDLE) + goto restart; + } + + if ((kcs_result == KCS_SM_IDLE) && (kcs_info->req_events)) { + /* We are idle and the upper layer requested that I fetch + events, so do so. */ + unsigned char msg[2]; + + kcs_info->req_events = 0; + msg[0] = (IPMI_NETFN_APP_REQUEST << 2); + msg[1] = IPMI_GET_MSG_FLAGS_CMD; + + start_kcs_transaction(kcs_info->kcs_sm, msg, 2); + kcs_info->kcs_state = KCS_GETTING_FLAGS; + goto restart; + } + + return kcs_result; +} + +static void sender(void *send_info, + struct ipmi_smi_msg *msg, + int priority) +{ + struct kcs_info *kcs_info = (struct kcs_info *) send_info; + enum kcs_result result; + unsigned long flags; +#ifdef DEBUG_TIMING + struct timeval t; +#endif + + spin_lock_irqsave(&(kcs_info->kcs_lock), flags); +#ifdef DEBUG_TIMING + do_gettimeofday(&t); + printk("**Enqueue: %d.%9.9d\n", t.tv_sec, t.tv_usec); +#endif + + if (kcs_info->run_to_completion) { + /* If we are running to completion, then throw it in + the list and run transactions until everything is + clear. Priority doesn't matter here. */ + list_add_tail(&(msg->link), &(kcs_info->xmit_msgs)); + result = kcs_event_handler(kcs_info, 0); + while (result != KCS_SM_IDLE) { + udelay(500); + result = kcs_event_handler(kcs_info, 500); + } + } else if ((kcs_info->kcs_state == KCS_NORMAL) + && (kcs_info->curr_msg == NULL)) + { + int rv; +#ifdef DEBUG_TIMING + do_gettimeofday(&t); + printk("**Start1: %d.%9.9d\n", t.tv_sec, t.tv_usec); +#endif + kcs_info->curr_msg = msg; + rv = start_kcs_transaction(kcs_info->kcs_sm, + kcs_info->curr_msg->data, + kcs_info->curr_msg->data_size); + /* If we get an error, put it in the queue to try again + later. */ + if (rv) { + kcs_info->curr_msg = NULL; + goto add_to_list; + } + } else { + add_to_list: + if (priority > 0) { + list_add_tail(&(msg->link), &(kcs_info->hp_xmit_msgs)); + } else { + list_add_tail(&(msg->link), &(kcs_info->xmit_msgs)); + } + } + + spin_unlock_irqrestore(&(kcs_info->kcs_lock), flags); +} + +static void set_run_to_completion(void *send_info, int i_run_to_completion) +{ + struct kcs_info *kcs_info = (struct kcs_info *) send_info; + enum kcs_result result; + unsigned long flags; + + spin_lock_irqsave(&(kcs_info->kcs_lock), flags); + + kcs_info->run_to_completion = i_run_to_completion; + if (i_run_to_completion) { + result = kcs_event_handler(kcs_info, 0); + while (result != KCS_SM_IDLE) { + udelay(500); + result = kcs_event_handler(kcs_info, 500); + } + } + + spin_unlock_irqrestore(&(kcs_info->kcs_lock), flags); +} + +static void request_events(void *send_info) +{ + struct kcs_info *kcs_info = (struct kcs_info *) send_info; + + kcs_info->req_events = 1; +} + +static void new_user(void *send_info) +{ + MOD_INC_USE_COUNT; +} + +static void user_left(void *send_info) +{ + MOD_DEC_USE_COUNT; +} + +/* Call every 10 ms. 
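+   As a sanity check on the arithmetic below: with HZ=100 a jiffy is
+   1000000/100 = 10000 microseconds, so KCS_TIMEOUT_JIFFIES comes out
+   to 10000/10000 = 1 tick; with HZ=1000 it comes out to 10 ticks.
+   Either way the timer fires roughly every 10 ms.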
 */
+#define KCS_TIMEOUT_TIME_USEC	10000
+#define KCS_USEC_PER_JIFFY	(1000000/HZ)
+#define KCS_TIMEOUT_JIFFIES	(KCS_TIMEOUT_TIME_USEC/KCS_USEC_PER_JIFFY)
+#define KCS_SHORT_TIMEOUT_USEC  500 /* .5ms when the SM requests a
+                                       short timeout */
+static int initialized = 0;
+
+static void kcs_timeout(unsigned long data)
+{
+    struct kcs_info *kcs_info = (struct kcs_info *) data;
+    enum kcs_result kcs_result;
+    unsigned long   flags;
+    unsigned long   jiffies_now;
+    unsigned long   time_diff;
+#ifdef DEBUG_TIMING
+    struct timeval t;
+#endif
+
+    if (kcs_info->stop_operation) {
+        kcs_info->timer_stopped = 1;
+        return;
+    }
+
+    spin_lock_irqsave(&(kcs_info->kcs_lock), flags);
+#ifdef DEBUG_TIMING
+    do_gettimeofday(&t);
+    printk("**Timer: %d.%9.9d\n", t.tv_sec, t.tv_usec);
+#endif
+    jiffies_now = jiffies;
+    time_diff = ((jiffies_now - kcs_info->last_timeout_jiffies)
+                 * KCS_USEC_PER_JIFFY);
+    kcs_result = kcs_event_handler(kcs_info, time_diff);
+
+    spin_unlock_irqrestore(&(kcs_info->kcs_lock), flags);
+
+    kcs_info->last_timeout_jiffies = jiffies_now;
+
+    if ((kcs_info->irq) && (! kcs_info->interrupt_disabled)) {
+        /* Running with interrupts, only do long timeouts. */
+        kcs_info->kcs_timer.expires = jiffies + KCS_TIMEOUT_JIFFIES;
+        goto do_add_timer;
+    }
+
+    /* If the state machine asks for a short delay, then shorten
+       the timer timeout. */
+#ifdef CONFIG_HIGH_RES_TIMERS
+    if (kcs_result == KCS_CALL_WITH_DELAY) {
+        kcs_info->kcs_timer.sub_expires
+            += usec_to_arch_cycles(KCS_SHORT_TIMEOUT_USEC);
+        while (kcs_info->kcs_timer.sub_expires >= cycles_per_jiffies) {
+            kcs_info->kcs_timer.expires++;
+            kcs_info->kcs_timer.sub_expires -= cycles_per_jiffies;
+        }
+    } else {
+        kcs_info->kcs_timer.expires = jiffies + KCS_TIMEOUT_JIFFIES;
+    }
+#else
+    /* If we have a fast clock tick, we take advantage of it. */
+    if (kcs_result == KCS_CALL_WITH_DELAY) {
+        kcs_info->kcs_timer.expires = jiffies + 1;
+    } else {
+        kcs_info->kcs_timer.expires = jiffies + KCS_TIMEOUT_JIFFIES;
+    }
+#endif
+
+ do_add_timer:
+    add_timer(&(kcs_info->kcs_timer));
+}
+
+static void kcs_irq_handler(int irq, void *data, struct pt_regs *regs)
+{
+    struct kcs_info *kcs_info = (struct kcs_info *) data;
+    unsigned long   flags;
+#ifdef DEBUG_TIMING
+    struct timeval t;
+#endif
+
+    spin_lock_irqsave(&(kcs_info->kcs_lock), flags);
+#ifdef DEBUG_TIMING
+    do_gettimeofday(&t);
+    printk("**Interrupt: %d.%9.9d\n", t.tv_sec, t.tv_usec);
+#endif
+    kcs_event_handler(kcs_info, 0);
+    spin_unlock_irqrestore(&(kcs_info->kcs_lock), flags);
+}
+
+static struct ipmi_smi_handlers handlers =
+{
+    sender:                sender,
+    request_events:        request_events,
+    new_user:              new_user,
+    user_left:             user_left,
+    set_run_to_completion: set_run_to_completion
+};
+
+static unsigned char ipmi_kcs_dev_rev;
+static unsigned char ipmi_kcs_fw_rev_major;
+static unsigned char ipmi_kcs_fw_rev_minor;
+static unsigned char ipmi_version_major;
+static unsigned char ipmi_version_minor;
+
+extern int kcs_dbg;
+static int ipmi_kcs_detect_hardware(int port, struct kcs_data *data)
+{
+    unsigned char   msg[2];
+    unsigned char   resp[IPMI_MAX_MSG_LENGTH];
+    unsigned long   resp_len;
+    enum kcs_result kcs_result;
+
+    /* It's impossible for the KCS status register to be all 1's,
+       but that's what you get from reading a bogus address, so we
+       test that first. */
+    if (inb(port+1) == 0xff)
+        return -ENODEV;
+
+    /* Do a Get Device ID command, since it comes back with some
+       useful info. 
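+       The response bytes, as unpacked below, are: resp[4] holds the
+       device revision in its low nibble, resp[5] the firmware major
+       revision (bit 7 is a status flag in the spec, so it is masked
+       off), resp[6] the firmware minor revision, and resp[7] the IPMI
+       version as BCD with the major version in the low nibble.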
 */
+    msg[0] = IPMI_NETFN_APP_REQUEST << 2;
+    msg[1] = IPMI_GET_DEVICE_ID_CMD;
+    start_kcs_transaction(data, msg, 2);
+
+    kcs_result = kcs_event(data, 0);
+    for (;;)
+    {
+        if (kcs_result == KCS_CALL_WITH_DELAY) {
+            udelay(100);
+            kcs_result = kcs_event(data, 100);
+        }
+        else if (kcs_result == KCS_CALL_WITHOUT_DELAY)
+        {
+            kcs_result = kcs_event(data, 0);
+        }
+        else
+            break;
+    }
+    if (kcs_result == KCS_SM_HOSED) {
+        /* We couldn't get the state machine to run, so whatever's at
+           the port is probably not an IPMI KCS interface. */
+        return -ENODEV;
+    }
+    /* Otherwise, we got some data. */
+    resp_len = kcs_get_result(data, resp, IPMI_MAX_MSG_LENGTH);
+    if (resp_len < 6)
+        /* That's odd, it should be longer. */
+        return -EINVAL;
+
+    if ((resp[1] != IPMI_GET_DEVICE_ID_CMD) || (resp[2] != 0))
+        /* That's odd, it shouldn't be able to fail. */
+        return -EINVAL;
+
+    ipmi_kcs_dev_rev = resp[4] & 0xf;
+    ipmi_kcs_fw_rev_major = resp[5] & 0x7f;
+    ipmi_kcs_fw_rev_minor = resp[6];
+    ipmi_version_major = resp[7] & 0xf;
+    ipmi_version_minor = resp[7] >> 4;
+
+    return 0;
+}
+
+#define MAX_KCS_DRIVERS	4
+static struct kcs_info *kcs_info[MAX_KCS_DRIVERS];
+
+#define DEVICE_NAME "ipmi_kcs"
+
+#define DEFAULT_IO_ADDR	0xca2
+
+static int kcs_addrs[MAX_KCS_DRIVERS] = { 0, -1, -1, -1 };
+static int kcs_irqs[MAX_KCS_DRIVERS] = { 0, 0, 0, 0 };
+
+MODULE_PARM(kcs_addrs, "1-4i");
+MODULE_PARM(kcs_irqs, "1-4i");
+
+/* Returns 0 if initialized, or negative on an error. */
+static int init_one_kcs(int kcs_addr, int irq, struct kcs_info **kcs)
+{
+    int             rv;
+    struct kcs_info *new_kcs;
+
+
+    /* The setup code sets the KCS address negative if it wants to
+       disable the driver. */
+    if (kcs_addr < 0)
+        return -ENODEV;
+
+    new_kcs = kmalloc(sizeof(*new_kcs), GFP_KERNEL);
+    if (!new_kcs) {
+        printk(KERN_ERR "Unable to initialize KCS, out of memory\n");
+        return -ENOMEM;
+    }
+
+    /* So we know not to free it unless we have allocated one. 
*/ + new_kcs->kcs_sm = NULL; + + if (kcs_addr == 0) { + kcs_addr = DEFAULT_IO_ADDR; + } + + new_kcs->addr = kcs_addr; + + if (request_region(kcs_addr, 2, DEVICE_NAME) == NULL) { + kfree(new_kcs); + printk(KERN_ERR "ipmi_kcs: Unable to get IO at 0x%4.4x\n", + kcs_addr); + return -EIO; + } + + new_kcs->kcs_sm = kmalloc(kcs_size(), GFP_KERNEL); + if (!new_kcs->kcs_sm) { + printk(KERN_ERR "Unable to initialize KCS, out of memory\n"); + rv = -ENOMEM; + goto out_err; + } + init_kcs_data(new_kcs->kcs_sm, kcs_addr); + spin_lock_init(&(new_kcs->kcs_lock)); + + rv = ipmi_kcs_detect_hardware(kcs_addr, new_kcs->kcs_sm); + if (rv) { + printk(KERN_ERR "ipmi_kcs_intf: Could not detect KCS" + " interface at 0x%4.4x\n", kcs_addr); + rv = -ENODEV; + goto out_err; + } + + if (irq != 0) { + rv = request_irq(irq, + kcs_irq_handler, + SA_INTERRUPT, + DEVICE_NAME, + new_kcs); + if (rv) { + printk(KERN_WARNING + "%s: Unable to claim interrupt %d," + " running polled\n", + DEVICE_NAME, irq); + irq = 0; + } + } + new_kcs->irq = irq; + + INIT_LIST_HEAD(&(new_kcs->xmit_msgs)); + INIT_LIST_HEAD(&(new_kcs->hp_xmit_msgs)); + new_kcs->curr_msg = NULL; + new_kcs->req_events = 0; + new_kcs->run_to_completion = 0; + + start_clear_flags(new_kcs); + + if (irq) { + new_kcs->kcs_state = KCS_CLEARING_FLAGS_THEN_SET_IRQ; + + printk(KERN_INFO "Initializing KCS driver at 0x%x, irq %d\n", + kcs_addr, irq); + + } else { + printk(KERN_INFO "Initializing KCS driver at 0x%x, no irq\n", + kcs_addr); + } + + rv = ipmi_register_smi(&handlers, + new_kcs, + ipmi_version_major, + ipmi_version_minor, + &(new_kcs->intf)); + if (rv) { + free_irq(irq, new_kcs); + printk(KERN_ERR "Unable to register IPMI KCS device: %d\n", + rv); + goto out_err; + } + + new_kcs->interrupt_disabled = 0; + new_kcs->timer_stopped = 0; + new_kcs->stop_operation = 0; + new_kcs->last_timeout_jiffies = jiffies; + + init_timer(&(new_kcs->kcs_timer)); + new_kcs->kcs_timer.data = (long) new_kcs; + new_kcs->kcs_timer.function = kcs_timeout; + new_kcs->kcs_timer.expires = jiffies + KCS_TIMEOUT_JIFFIES; + add_timer(&(new_kcs->kcs_timer)); + + *kcs = new_kcs; + + return 0; + + out_err: + release_region (kcs_addr, 2); + if (new_kcs->kcs_sm) + kfree(new_kcs->kcs_sm); + kfree(new_kcs); + return rv; +} + +static __init int init_ipmi_kcs(void) +{ + int rv; + int pos; + int i; + + if (initialized) + return 0; + initialized = 1; + + pos = 0; + for (i=0; i<MAX_KCS_DRIVERS; i++) { + rv = init_one_kcs(kcs_addrs[i], kcs_irqs[i], &(kcs_info[pos])); + if (rv == 0) { + pos++; + } + } + + if (kcs_info[0] == NULL) { + printk("ipmi_kcs_intf: Unable to find any KCS interfaces\n"); + return -ENODEV; + } else + return 0; +} + +#ifdef MODULE +void __exit cleanup_one_kcs(struct kcs_info *to_clean) +{ + int rv; + + if (! to_clean) + return; + + if (to_clean->irq != 0) + free_irq(to_clean->irq, to_clean); + + release_region (to_clean->addr, 2); + + /* Tell the timer to stop, then wait for it to stop. This avoids + problems with race conditions removing the timer here. 
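+	   The handshake is: stop_operation tells kcs_timeout() not to
+	   re-arm itself (the timer normally re-adds itself on every
+	   run), and timer_stopped is the timer's acknowledgement,
+	   polled below with schedule_timeout(1) until it is seen.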
*/ + to_clean->stop_operation = 1; + while (!to_clean->timer_stopped) { + schedule_timeout(1); + } + + rv = ipmi_unregister_smi(to_clean->intf); + if (rv) { + printk(KERN_ERR "Unable to unregister IPMI KCS device: %d\n", + rv); + } + + initialized = 0; + + kfree(to_clean->kcs_sm); + kfree(to_clean); +} + +static __exit void cleanup_ipmi_kcs(void) +{ + int i; + + if (!initialized) + return; + + for (i=0; i<MAX_KCS_DRIVERS; i++) { + cleanup_one_kcs(kcs_info[i]); + } +} +module_exit(cleanup_ipmi_kcs); +#else +static int __init ipmi_kcs_setup(char *str) +{ + int val, rv = 2, pos; + char *s; + + + pos = 0; + while ((pos < 4) && (rv == 2)) { + rv = get_option(&str, &val); + if (rv == 0) { + s = strsep(&str, ","); + if (strcmp(s, "off") == 0) { + kcs_addrs[pos] = -1; + goto got_addr; + } else + break; + } else { + kcs_addrs[pos] = val; + if (rv == 2) { + got_addr: + rv = get_option(&str, &val); + if (rv) + kcs_irqs[pos] = val; + } + } + pos++; + } + + return 1; +} +__setup("ipmi_kcs=", ipmi_kcs_setup); +#endif + +module_init(init_ipmi_kcs); +MODULE_LICENSE("GPL"); diff -urN linux.orig/drivers/char/ipmi/ipmi_kcs_sm.c linux/drivers/char/ipmi/ipmi_kcs_sm.c --- linux.orig/drivers/char/ipmi/ipmi_kcs_sm.c Wed Dec 31 18:00:00 1969 +++ linux/drivers/char/ipmi/ipmi_kcs_sm.c Mon Oct 28 16:38:23 2002 @@ -0,0 +1,449 @@ +/* + * ipmi_kcs_sm.c + * + * State machine for handling IPMI KCS interfaces. + * + * Author: MontaVista Software, Inc. + * Corey Minyard <minyard@mvista.com> + * source@mvista.com + * + * Copyright 2002 MontaVista Software Inc. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or (at your + * option) any later version. + * + * + * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED + * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, + * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS + * OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND + * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR + * TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE + * USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + * + * You should have received a copy of the GNU General Public License along + * with this program; if not, write to the Free Software Foundation, Inc., + * 675 Mass Ave, Cambridge, MA 02139, USA. + */ + +/* + * This state machine is taken from the state machine in the IPMI spec, + * pretty much verbatim. If you have questions about the states, see + * that document. + */ + +#include <asm/io.h> + +#include "ipmi_kcs_sm.h" + +/* Set this if you want a printout of why the state machine was hosed + when it gets hosed. */ +#define DEBUG_HOSED_REASON + +/* Print the state machine state on entry every time. */ +#undef DEBUG_STATE + +/* The states the KCS driver may be in. */ +enum kcs_states { + KCS_IDLE, /* The KCS interface is currently + doing nothing. */ + KCS_START_OP, /* We are starting an operation. The + data is in the output buffer, but + nothing has been done to the + interface yet. This was added to + the state machine in the spec to + wait for the initial IBF. 
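+				   (IBF is the "input buffer full" bit
+				   in the KCS status register; the host
+				   may only hand the interface a new
+				   byte when IBF is clear.)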
*/ + KCS_WAIT_WRITE_START, /* We have written a write cmd to the + interface. */ + KCS_WAIT_WRITE, /* We are writing bytes to the + interface. */ + KCS_WAIT_WRITE_END, /* We have written the write end cmd + to the interface, and still need to + write the last byte. */ + KCS_WAIT_READ, /* We are waiting to read data from + the interface. */ + KCS_ERROR0, /* State to transition to the error + handler, this was added to the + state machine in the spec to be + sure IBF was there. */ + KCS_ERROR1, /* First stage error handler, wait for + the interface to respond. */ + KCS_ERROR2, /* The abort cmd has been written, + wait for the interface to + respond. */ + KCS_ERROR3, /* We wrote some data to the + interface, wait for it to switch to + read mode. */ + KCS_HOSED /* The hardware failed to follow the + state machine. */ +}; + +#define MAX_KCS_READ_SIZE 80 +#define MAX_KCS_WRITE_SIZE 80 + +/* Timeouts in microseconds. */ +#define IBF_RETRY_TIMEOUT 1000000 +#define OBF_RETRY_TIMEOUT 1000000 +#define MAX_ERROR_RETRIES 10 + +#define IPMI_ERR_MSG_TRUNCATED 0xc6 + +struct kcs_data +{ + enum kcs_states state; + unsigned int port; + unsigned char write_data[MAX_KCS_WRITE_SIZE]; + int write_pos; + int write_count; + int orig_write_count; + unsigned char read_data[MAX_KCS_READ_SIZE]; + int read_pos; + int truncated; + + unsigned int error_retries; + long ibf_timeout; + long obf_timeout; +}; + +void init_kcs_data(struct kcs_data *kcs, unsigned int port) +{ + kcs->state = KCS_IDLE; + kcs->port = port; + kcs->write_pos = 0; + kcs->write_count = 0; + kcs->orig_write_count = 0; + kcs->read_pos = 0; + kcs->error_retries = 0; + kcs->truncated = 0; + kcs->ibf_timeout = IBF_RETRY_TIMEOUT; + kcs->obf_timeout = OBF_RETRY_TIMEOUT; +} + +static inline unsigned char read_status(struct kcs_data *kcs) +{ + return inb(kcs->port + 1); +} + +static inline unsigned char read_data(struct kcs_data *kcs) +{ + return inb(kcs->port + 0); +} + +static inline void write_cmd(struct kcs_data *kcs, unsigned char data) +{ + outb(data, kcs->port + 1); +} + +static inline void write_data(struct kcs_data *kcs, unsigned char data) +{ + outb(data, kcs->port + 0); +} + +/* Control codes. */ +#define KCS_GET_STATUS_ABORT 0x60 +#define KCS_WRITE_START 0x61 +#define KCS_WRITE_END 0x62 +#define KCS_READ_BYTE 0x68 + +/* Status bits. */ +#define GET_STATUS_STATE(status) (((status) >> 6) & 0x03) +#define KCS_IDLE_STATE 0 +#define KCS_READ_STATE 1 +#define KCS_WRITE_STATE 2 +#define KCS_ERROR_STATE 3 +#define GET_STATUS_ATN(status) ((status) & 0x04) +#define GET_STATUS_IBF(status) ((status) & 0x02) +#define GET_STATUS_OBF(status) ((status) & 0x01) + + +static inline void write_next_byte(struct kcs_data *kcs) +{ + write_data(kcs, kcs->write_data[kcs->write_pos]); + (kcs->write_pos)++; + (kcs->write_count)--; +} + +static inline void start_error_recovery(struct kcs_data *kcs, char *reason) +{ + (kcs->error_retries)++; + if (kcs->error_retries > MAX_ERROR_RETRIES) { +#ifdef DEBUG_HOSED_REASON + printk("ipmi_kcs_sm: kcs hosed: %s\n", reason); +#endif + kcs->state = KCS_HOSED; + } else { + kcs->state = KCS_ERROR0; + } +} + +static inline void read_next_byte(struct kcs_data *kcs) +{ + if (kcs->read_pos >= MAX_KCS_READ_SIZE) { + /* Throw the data away and mark it truncated. 
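+		   The byte still has to be read, and a KCS_READ_BYTE
+		   still has to be written (both done below), so the
+		   BMC keeps advancing through the transfer; only the
+		   storing of the byte is skipped.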
*/ + read_data(kcs); + kcs->truncated = 1; + } else { + kcs->read_data[kcs->read_pos] = read_data(kcs); + (kcs->read_pos)++; + } + write_data(kcs, KCS_READ_BYTE); +} + +static inline int check_ibf(struct kcs_data *kcs, + unsigned char status, + long time) +{ + if (GET_STATUS_IBF(status)) { + kcs->ibf_timeout -= time; + if (kcs->ibf_timeout < 0) { + start_error_recovery(kcs, "IBF not ready in time"); + return 1; + } + return 0; + } + kcs->ibf_timeout = IBF_RETRY_TIMEOUT; + return 1; +} + +static inline int check_obf(struct kcs_data *kcs, + unsigned char status, + long time) +{ + if (! GET_STATUS_OBF(status)) { + kcs->obf_timeout -= time; + if (kcs->obf_timeout < 0) { + start_error_recovery(kcs, "OBF not ready in time"); + return 1; + } + return 0; + } + kcs->obf_timeout = OBF_RETRY_TIMEOUT; + return 1; +} + +static void clear_obf(struct kcs_data *kcs, unsigned char status) +{ + if (GET_STATUS_OBF(status)) + read_data(kcs); +} + +static void restart_kcs_transaction(struct kcs_data *kcs) +{ + kcs->write_count = kcs->orig_write_count; + kcs->write_pos = 0; + kcs->read_pos = 0; + kcs->state = KCS_WAIT_WRITE_START; + kcs->ibf_timeout = IBF_RETRY_TIMEOUT; + kcs->obf_timeout = OBF_RETRY_TIMEOUT; + write_cmd(kcs, KCS_WRITE_START); +} + +int start_kcs_transaction(struct kcs_data *kcs, char *data, unsigned int size) +{ + if ((size < 2) || (size > MAX_KCS_WRITE_SIZE)) { + return -1; + } + + if ((kcs->state != KCS_IDLE) && (kcs->state != KCS_HOSED)) { + return -2; + } + + kcs->error_retries = 0; + memcpy(kcs->write_data, data, size); + kcs->write_count = size; + kcs->orig_write_count = size; + kcs->write_pos = 0; + kcs->read_pos = 0; + kcs->state = KCS_START_OP; + kcs->ibf_timeout = IBF_RETRY_TIMEOUT; + kcs->obf_timeout = OBF_RETRY_TIMEOUT; + return 0; +} + +int kcs_get_result(struct kcs_data *kcs, unsigned char *data, int length) +{ + if (length < kcs->read_pos) { + kcs->read_pos = length; + kcs->truncated = 1; + } + + memcpy(data, kcs->read_data, kcs->read_pos); + + if (kcs->truncated) { + /* Report a truncated error. We might overwrite + another error, but that's too bad, the user needs + to know it was truncated. */ + data[2] = IPMI_ERR_MSG_TRUNCATED; + kcs->truncated = 0; + } + + return kcs->read_pos; +} + +/* This implements the state machine defined in the IPMI manual, see + that for details on how this works. */ +enum kcs_result kcs_event(struct kcs_data *kcs, long time) +{ + unsigned char status; + unsigned char state; + + status = read_status(kcs); + +#ifdef DEBUG_STATE + printk(" State = %d, %x\n", kcs->state, status); +#endif + /* All states wait for ibf, so just do it here. */ + if (!check_ibf(kcs, status, time)) + return KCS_CALL_WITH_DELAY; + + /* Just about everything looks at the KCS state, so grab that, too. 
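+	   GET_STATUS_STATE() pulls bits 7:6 out of the status byte,
+	   so, for example, a status byte of 0x80 decodes to state 2,
+	   KCS_WRITE_STATE.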
*/ + state = GET_STATUS_STATE(status); + + switch (kcs->state) { + case KCS_IDLE: + if (GET_STATUS_ATN(status)) + return KCS_ATTN; + else + return KCS_SM_IDLE; + + case KCS_START_OP: + if (state != KCS_IDLE) { + start_error_recovery(kcs, + "State machine not idle at start"); + break; + } + + clear_obf(kcs, status); + write_cmd(kcs, KCS_WRITE_START); + kcs->state = KCS_WAIT_WRITE_START; + break; + + case KCS_WAIT_WRITE_START: + if (state != KCS_WRITE_STATE) { + start_error_recovery( + kcs, + "Not in write state at write start"); + break; + } + read_data(kcs); + if (kcs->write_count == 1) { + write_cmd(kcs, KCS_WRITE_END); + kcs->state = KCS_WAIT_WRITE_END; + } else { + write_next_byte(kcs); + kcs->state = KCS_WAIT_WRITE; + } + break; + + case KCS_WAIT_WRITE: + if (state != KCS_WRITE_STATE) { + start_error_recovery(kcs, + "Not in write state for write"); + break; + } + clear_obf(kcs, status); + if (kcs->write_count == 1) { + write_cmd(kcs, KCS_WRITE_END); + kcs->state = KCS_WAIT_WRITE_END; + } else { + write_next_byte(kcs); + } + break; + + case KCS_WAIT_WRITE_END: + if (state != KCS_WRITE_STATE) { + start_error_recovery(kcs, + "Not in write state for write end"); + break; + } + clear_obf(kcs, status); + write_next_byte(kcs); + kcs->state = KCS_WAIT_READ; + break; + + case KCS_WAIT_READ: + if ((state != KCS_READ_STATE) && (state != KCS_IDLE_STATE)) { + start_error_recovery( + kcs, + "Not in read or idle in read state"); + break; + } + if (! check_obf(kcs, status, time)) + return KCS_CALL_WITH_DELAY; + + if (state == KCS_READ_STATE) { + read_next_byte(kcs); + } else { + read_data(kcs); + kcs->orig_write_count = 0; + kcs->state = KCS_IDLE; + return KCS_TRANSACTION_COMPLETE; + } + break; + + case KCS_ERROR0: + clear_obf(kcs, status); + write_cmd(kcs, KCS_GET_STATUS_ABORT); + kcs->state = KCS_ERROR1; + break; + + case KCS_ERROR1: + clear_obf(kcs, status); + write_data(kcs, 0); + kcs->state = KCS_ERROR2; + break; + + case KCS_ERROR2: + if (state != KCS_READ_STATE) { + start_error_recovery(kcs, + "Not in read state for error2"); + break; + } + if (! check_obf(kcs, status, time)) + return KCS_CALL_WITH_DELAY; + + clear_obf(kcs, status); + write_data(kcs, KCS_READ_BYTE); + kcs->state = KCS_ERROR3; + break; + + case KCS_ERROR3: + if (state != KCS_IDLE_STATE) { + start_error_recovery(kcs, + "Not in idle state for error3"); + break; + } + + if (! check_obf(kcs, status, time)) + return KCS_CALL_WITH_DELAY; + + clear_obf(kcs, status); + if (kcs->orig_write_count) { + restart_kcs_transaction(kcs); + } else { + kcs->state = KCS_IDLE; + return KCS_TRANSACTION_COMPLETE; + } + break; + + case KCS_HOSED: + return KCS_SM_HOSED; + } + + if (kcs->state == KCS_HOSED) { + init_kcs_data(kcs, kcs->port); + return KCS_SM_HOSED; + } + + return KCS_CALL_WITHOUT_DELAY; +} + +int kcs_size(void) +{ + return sizeof(struct kcs_data); +} diff -urN linux.orig/drivers/char/ipmi/ipmi_kcs_sm.h linux/drivers/char/ipmi/ipmi_kcs_sm.h --- linux.orig/drivers/char/ipmi/ipmi_kcs_sm.h Wed Dec 31 18:00:00 1969 +++ linux/drivers/char/ipmi/ipmi_kcs_sm.h Sun Oct 13 16:25:50 2002 @@ -0,0 +1,69 @@ +/* + * ipmi_kcs_sm.h + * + * State machine for handling IPMI KCS interfaces. + * + * Author: MontaVista Software, Inc. + * Corey Minyard <minyard@mvista.com> + * source@mvista.com + * + * Copyright 2002 MontaVista Software Inc. 
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2 of the License, or (at your
+ * option) any later version.
+ *
+ *
+ * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED
+ * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
+ * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
+ * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+ * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
+ * OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+ * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR
+ * TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE
+ * USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, write to the Free Software Foundation, Inc.,
+ * 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+
+struct kcs_data;
+
+void init_kcs_data(struct kcs_data *kcs,
+                   unsigned int    port);
+
+/* Start a new transaction in the state machine.  This will return -2
+   if the state machine is not idle, -1 if the size is invalid (too
+   large or too small), or 0 if the transaction was successfully
+   started. */
+int start_kcs_transaction(struct kcs_data *kcs, char *data, unsigned int size);
+
+/* Return the results after the transaction.  This will return -1 if
+   the buffer is too small, zero if no transaction is present, or the
+   actual length of the result data. */
+int kcs_get_result(struct kcs_data *kcs, unsigned char *data, int length);
+
+enum kcs_result
+{
+    KCS_CALL_WITHOUT_DELAY, /* Call the driver again immediately */
+    KCS_CALL_WITH_DELAY,    /* Delay some before calling again. */
+    KCS_TRANSACTION_COMPLETE, /* A transaction is finished. */
+    KCS_SM_IDLE,            /* The SM is in idle state. */
+    KCS_SM_HOSED,           /* The hardware violated the state machine. */
+    KCS_ATTN                /* The hardware is asserting attn and the
+                               state machine is idle. */
+};
+
+/* Call this periodically (for a polled interface) or upon receiving
+   an interrupt (for an interrupt-driven interface).  If interrupt
+   driven, you should probably poll this periodically when not in idle
+   state.  This should be called with the time that passed since the
+   last call, if it is significant.  Time is in microseconds. */
+enum kcs_result kcs_event(struct kcs_data *kcs, long time);
+
+/* Return the size of the KCS structure in bytes. */
+int kcs_size(void);
diff -urN linux.orig/drivers/char/ipmi/ipmi_msghandler.c linux/drivers/char/ipmi/ipmi_msghandler.c
--- linux.orig/drivers/char/ipmi/ipmi_msghandler.c	Wed Dec 31 18:00:00 1969
+++ linux/drivers/char/ipmi/ipmi_msghandler.c	Wed Oct 30 13:51:55 2002
@@ -0,0 +1,1797 @@
+/*
+ * ipmi_msghandler.c
+ *
+ * Incoming and outgoing message routing for an IPMI interface.
+ *
+ * Author: MontaVista Software, Inc.
+ *         Corey Minyard <minyard@mvista.com>
+ *         source@mvista.com
+ *
+ * Copyright 2002 MontaVista Software Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2 of the License, or (at your
+ * option) any later version. 
+ *
+ *
+ * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED
+ * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
+ * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
+ * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+ * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
+ * OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+ * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR
+ * TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE
+ * USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, write to the Free Software Foundation, Inc.,
+ * 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+
+#include <linux/config.h>
+#include <linux/module.h>
+#include <linux/errno.h>
+#include <asm/system.h>
+#include <linux/sched.h>
+#include <linux/poll.h>
+#include <linux/spinlock.h>
+#include <linux/slab.h>
+#include <linux/ipmi.h>
+#include <linux/ipmi_smi.h>
+#include <linux/notifier.h>
+#include <linux/init.h>
+
+struct ipmi_recv_msg *ipmi_alloc_recv_msg(void);
+static int ipmi_init_msghandler(void);
+
+static int initialized = 0;
+
+#define MAX_EVENTS_IN_QUEUE	25
+
+struct ipmi_user
+{
+    struct list_head link;
+
+    /* The upper layer that handles receive messages. */
+    struct ipmi_user_hndl *handler;
+    void                  *handler_data;
+
+    /* The interface this user is bound to. */
+    ipmi_smi_t intf;
+
+    /* Does this interface receive IPMI events? */
+    int gets_events;
+};
+
+struct cmd_rcvr
+{
+    struct list_head link;
+
+    ipmi_user_t   user;
+    unsigned char netfn;
+    unsigned char cmd;
+};
+
+#define IPMI_IPMB_NUM_SEQ	64
+struct ipmi_smi
+{
+    /* The list of upper layers that are using me.  We read-lock
+       this when delivering messages to the upper layer to keep
+       the user from going away while we are processing the
+       message.  This means that you cannot add or delete a user
+       from the receive callback. */
+    rwlock_t         users_lock;
+    struct list_head users;
+
+    /* The IPMI version of the BMC on the other end. */
+    unsigned char version_major;
+    unsigned char version_minor;
+
+    /* This is the lower-layer's sender routine. */
+    struct ipmi_smi_handlers *handlers;
+    void                     *send_info;
+
+    /* A table of sequence numbers for this interface.  We use the
+       sequence numbers for IPMB messages that go out of the
+       interface to match them up with their responses.  A routine
+       is called periodically to time the items in this list. */
+    spinlock_t seq_lock;
+    struct {
+        unsigned long        timeout;
+        int                  inuse;
+        struct ipmi_recv_msg *recv_msg;
+    } seq_table[IPMI_IPMB_NUM_SEQ];
+    int curr_seq;
+
+    /* Messages that were delayed for some reason (out of memory,
+       for instance) will go in here to be processed later in a
+       periodic timer interrupt. */
+    spinlock_t       waiting_msgs_lock;
+    struct list_head waiting_msgs;
+
+    /* The list of command receivers that are registered for commands
+       on this interface. */
+    rwlock_t         cmd_rcvr_lock;
+    struct list_head cmd_rcvrs;
+
+    /* Events that were queued because no one was there to receive
+       them. */
+    spinlock_t       events_lock; /* For dealing with event stuff. */
+    struct list_head waiting_events;
+    unsigned int     waiting_events_count; /* How many events in queue? 
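+                                              (presumably bounded by
+                                              MAX_EVENTS_IN_QUEUE
+                                              above, so an absent
+                                              reader cannot pin
+                                              unbounded memory)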
 */
+
+    /* This will be non-null if someone registers to receive all
+       IPMI commands (this is for interface emulation).  There
+       must not be any entries in the cmd_rcvrs list above when
+       this is registered. */
+    ipmi_user_t all_cmd_rcvr;
+
+    /* My slave address.  This is initialized to IPMI_BMC_SLAVE_ADDR,
+       but may be changed by the user. */
+    unsigned char my_address;
+
+    /* My LUN.  This should generally stay the SMS LUN, but just in
+       case... */
+    unsigned char my_lun;
+};
+
+int
+ipmi_register_all_cmd_rcvr(ipmi_user_t user)
+{
+    unsigned long flags;
+    int           rv = -EBUSY;
+
+    write_lock_irqsave(&(user->intf->users_lock), flags);
+    write_lock(&(user->intf->cmd_rcvr_lock));
+    if ((user->intf->all_cmd_rcvr == NULL)
+        && (list_empty(&(user->intf->cmd_rcvrs))))
+    {
+        user->intf->all_cmd_rcvr = user;
+        rv = 0;
+    }
+    write_unlock(&(user->intf->cmd_rcvr_lock));
+    write_unlock_irqrestore(&(user->intf->users_lock), flags);
+    return rv;
+}
+
+int
+ipmi_unregister_all_cmd_rcvr(ipmi_user_t user)
+{
+    unsigned long flags;
+    int           rv = -EINVAL;
+
+    write_lock_irqsave(&(user->intf->users_lock), flags);
+    write_lock(&(user->intf->cmd_rcvr_lock));
+    if (user->intf->all_cmd_rcvr == user)
+    {
+        user->intf->all_cmd_rcvr = NULL;
+        rv = 0;
+    }
+    write_unlock(&(user->intf->cmd_rcvr_lock));
+    write_unlock_irqrestore(&(user->intf->users_lock), flags);
+    return rv;
+}
+
+
+#define MAX_IPMI_INTERFACES 4
+static ipmi_smi_t ipmi_interfaces[MAX_IPMI_INTERFACES];
+/* Write locking for these is done with two locks, first the "outside"
+   lock and then the normal lock.  This way, the interfaces lock can
+   be converted to a read lock without allowing a new write locker to
+   come in.  Note that at interrupt level this can only be claimed
+   read, so there is no reason for the read lock to save interrupts.
+   Write locks must still save interrupts because they can block an
+   interrupt. 
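+
+   In outline, a writer that wants to downgrade does:
+
+       spin_lock(&interfaces_outside_lock);
+       write_lock(&interfaces_lock);
+       ... modify ...
+       write_unlock(&interfaces_lock);
+       read_lock(&interfaces_lock);       still under the outside lock,
+                                          so no writer can slip in here
+       spin_unlock(&interfaces_outside_lock);
+       ... use ...
+       read_unlock(&interfaces_lock);
+
+   (A sketch of the intent only, not a quote of any one call site
+   below.) 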
*/ +static rwlock_t interfaces_lock = RW_LOCK_UNLOCKED; +static spinlock_t interfaces_outside_lock = SPIN_LOCK_UNLOCKED; + +static struct list_head smi_watchers = LIST_HEAD_INIT(smi_watchers); +static rwlock_t smi_watcher_lock = RW_LOCK_UNLOCKED; + +int ipmi_smi_watcher_register(struct ipmi_smi_watcher *watcher) +{ + int i; + + read_lock(&interfaces_lock); + write_lock(&smi_watcher_lock); + list_add(&(watcher->link), &smi_watchers); + for (i=0; i<MAX_IPMI_INTERFACES; i++) { + if (ipmi_interfaces[i] != NULL) { + watcher->new_smi(i); + } + } + write_unlock(&smi_watcher_lock); + read_unlock(&interfaces_lock); + return 0; +} + +int ipmi_smi_watcher_unregister(struct ipmi_smi_watcher *watcher) +{ + write_lock(&smi_watcher_lock); + list_del(&(watcher->link)); + write_unlock(&smi_watcher_lock); + return 0; +} + +int +ipmi_addr_equal(struct ipmi_addr *addr1, struct ipmi_addr *addr2) +{ + if (addr1->addr_type != addr2->addr_type) + return 0; + + if (addr1->channel != addr2->channel) + return 0; + + if (addr1->addr_type == IPMI_SYSTEM_INTERFACE_ADDR_TYPE) { + struct ipmi_system_interface_addr *smi_addr1 + = (struct ipmi_system_interface_addr *) addr1; + struct ipmi_system_interface_addr *smi_addr2 + = (struct ipmi_system_interface_addr *) addr2; + return (smi_addr1->lun == smi_addr2->lun); + } + + if ((addr1->addr_type == IPMI_IPMB_ADDR_TYPE) + || (addr1->addr_type == IPMI_IPMB_BROADCAST_ADDR_TYPE)) + { + struct ipmi_ipmb_addr *ipmb_addr1 + = (struct ipmi_ipmb_addr *) addr1; + struct ipmi_ipmb_addr *ipmb_addr2 + = (struct ipmi_ipmb_addr *) addr2; + + return ((ipmb_addr1->slave_addr == ipmb_addr2->slave_addr) + && (ipmb_addr1->lun == ipmb_addr2->lun)); + } + + return 1; +} + +int ipmi_validate_addr(struct ipmi_addr *addr, int len) +{ + if (len < sizeof(struct ipmi_system_interface_addr)) { + return -EINVAL; + } + + if (addr->addr_type == IPMI_SYSTEM_INTERFACE_ADDR_TYPE) { + if (addr->channel != IPMI_BMC_CHANNEL) + return -EINVAL; + return 0; + } + + if ((addr->channel == IPMI_BMC_CHANNEL) + || (addr->channel >= IPMI_NUM_CHANNELS) + || (addr->channel < 0)) + return -EINVAL; + + if ((addr->addr_type == IPMI_IPMB_ADDR_TYPE) + || (addr->addr_type == IPMI_IPMB_BROADCAST_ADDR_TYPE)) + { + if (len < sizeof(struct ipmi_ipmb_addr)) { + return -EINVAL; + } + return 0; + } + + return -EINVAL; +} + +unsigned int ipmi_addr_length(int addr_type) +{ + if (addr_type == IPMI_SYSTEM_INTERFACE_ADDR_TYPE) + return sizeof(struct ipmi_system_interface_addr); + + if ((addr_type == IPMI_IPMB_ADDR_TYPE) + || (addr_type == IPMI_IPMB_BROADCAST_ADDR_TYPE)) + { + return sizeof(struct ipmi_ipmb_addr); + } + + return 0; +} + +static void deliver_response(struct ipmi_recv_msg *msg) +{ + msg->user->handler->ipmi_recv_hndl(msg, msg->user->handler_data); +} + +/* Find the next sequence number not being used and add the given + message with the given timeout to the sequence table. */ +static int intf_next_seq(ipmi_smi_t intf, + struct ipmi_recv_msg *recv_msg, + unsigned long timeout, + unsigned char *seq) +{ + int rv = 0; + unsigned long flags; + unsigned int i; + + spin_lock_irqsave(&(intf->seq_lock), flags); + for (i=intf->curr_seq; + i!=(intf->curr_seq-1); + i=(i+1)%IPMI_IPMB_NUM_SEQ) + { + if (! intf->seq_table[i].inuse) + break; + } + + if (! 
intf->seq_table[i].inuse) {
+        intf->seq_table[i].recv_msg = recv_msg;
+        intf->seq_table[i].timeout = timeout;
+        intf->seq_table[i].inuse = 1;
+        *seq = i;
+        intf->curr_seq = (i+1)%IPMI_IPMB_NUM_SEQ;
+    } else {
+        rv = -EAGAIN;
+    }
+
+    spin_unlock_irqrestore(&(intf->seq_lock), flags);
+
+    return rv;
+}
+
+/* Return the receive message for the given sequence number and
+   release the sequence number so it can be reused.  Some other data
+   is passed in to be sure the message matches up correctly (to help
+   guard against messages coming in after their timeout and the
+   sequence number being reused). */
+static int intf_find_seq(ipmi_smi_t           intf,
+                         unsigned char        seq,
+                         short                channel,
+                         unsigned char        cmd,
+                         unsigned char        netfn,
+                         struct ipmi_addr     *addr,
+                         struct ipmi_recv_msg **recv_msg)
+{
+    int           rv = -ENODEV;
+    unsigned long flags;
+
+    if (seq >= IPMI_IPMB_NUM_SEQ)
+        return -EINVAL;
+
+    spin_lock_irqsave(&(intf->seq_lock), flags);
+    if (intf->seq_table[seq].inuse) {
+        struct ipmi_recv_msg *msg = intf->seq_table[seq].recv_msg;
+
+        if ((msg->addr.channel == channel)
+            && (msg->msg.cmd == cmd)
+            && (msg->msg.netfn == netfn)
+            && (ipmi_addr_equal(addr, &(msg->addr))))
+        {
+            *recv_msg = msg;
+            intf->seq_table[seq].inuse = 0;
+            rv = 0;
+        }
+    }
+    spin_unlock_irqrestore(&(intf->seq_lock), flags);
+
+    return rv;
+}
+
+
+int ipmi_create_user(unsigned int          if_num,
+                     struct ipmi_user_hndl *handler,
+                     void                  *handler_data,
+                     ipmi_user_t           *user)
+{
+    unsigned long flags;
+    ipmi_user_t   new_user;
+    int           rv = 0;
+
+    if (handler == NULL)
+        return -EINVAL;
+
+    MOD_INC_USE_COUNT;
+
+    /* Make sure the driver is actually initialized, this handles
+       problems with initialization order. */
+    if (!initialized) {
+        rv = ipmi_init_msghandler();
+        if (rv) {
+            MOD_DEC_USE_COUNT;
+            return rv;
+        }
+        /* The init code doesn't return an error if it was turned
+           off, but it won't initialize.  Check that. */
+        if (!initialized) {
+            MOD_DEC_USE_COUNT;
+            return -ENODEV;
+        }
+    }
+
+    new_user = kmalloc(sizeof(*new_user), GFP_KERNEL);
+    if (! new_user) {
+        MOD_DEC_USE_COUNT;
+        return -ENOMEM;
+    }
+
+    read_lock(&interfaces_lock);
+    if ((if_num >= MAX_IPMI_INTERFACES) || ipmi_interfaces[if_num] == NULL)
+    {
+        rv = -EINVAL;
+        goto out_unlock;
+    }
+
+    new_user->handler = handler;
+    new_user->handler_data = handler_data;
+    new_user->intf = ipmi_interfaces[if_num];
+    new_user->gets_events = 0;
+
+    write_lock_irqsave(&(new_user->intf->users_lock), flags);
+    list_add_tail(&(new_user->link), &(new_user->intf->users));
+    write_unlock_irqrestore(&(new_user->intf->users_lock), flags);
+
+ out_unlock:
+    if (rv) {
+        MOD_DEC_USE_COUNT;
+        kfree(new_user);
+    } else {
+        new_user->intf->handlers->new_user(new_user->intf->send_info);
+        *user = new_user;
+    }
+
+    read_unlock(&interfaces_lock);
+    return rv;
+}
+
+static int ipmi_destroy_user_nolock(ipmi_user_t user)
+{
+    int              rv = -ENODEV;
+    ipmi_user_t      t_user;
+    struct list_head *entry, *entry2;
+    int              i;
+    unsigned long    flags;
+
+    /* Find the user and delete them from the list. */
+    list_for_each(entry, &(user->intf->users)) {
+        t_user = list_entry(entry, struct ipmi_user, link);
+        if (t_user == user) {
+            list_del(entry);
+            rv = 0;
+            break;
+        }
+    }
+
+    if (rv) {
+        goto out_unlock;
+    }
+
+    /* Remove the user from the interfaces sequence table. 
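+       This keeps the response-matching and timeout code from
+       delivering into a user structure that is about to be freed:
+       once inuse is cleared, intf_find_seq() will no longer match
+       those entries.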
 */
+    spin_lock_irqsave(&(user->intf->seq_lock), flags);
+    for (i=0; i<IPMI_IPMB_NUM_SEQ; i++) {
+        if (user->intf->seq_table[i].inuse
+            && (user->intf->seq_table[i].recv_msg->user == user))
+        {
+            user->intf->seq_table[i].inuse = 0;
+        }
+    }
+    spin_unlock_irqrestore(&(user->intf->seq_lock), flags);
+
+    /* Remove the user from the command receiver's table. */
+    write_lock_irqsave(&(user->intf->cmd_rcvr_lock), flags);
+    list_for_each_safe(entry, entry2, &(user->intf->cmd_rcvrs)) {
+        struct cmd_rcvr *rcvr;
+        rcvr = list_entry(entry, struct cmd_rcvr, link);
+        if (rcvr->user == user) {
+            list_del(entry);
+            kfree(rcvr);
+        }
+    }
+    write_unlock_irqrestore(&(user->intf->cmd_rcvr_lock), flags);
+
+    kfree(user);
+
+ out_unlock:
+
+    return rv;
+}
+
+int ipmi_destroy_user(ipmi_user_t user)
+{
+    int           rv;
+    ipmi_smi_t    intf = user->intf;
+    unsigned long flags;
+
+    read_lock(&interfaces_lock);
+    write_lock_irqsave(&(intf->users_lock), flags);
+    rv = ipmi_destroy_user_nolock(user);
+    if (!rv) {
+        intf->handlers->user_left(intf->send_info);
+        MOD_DEC_USE_COUNT;
+    }
+
+    write_unlock_irqrestore(&(intf->users_lock), flags);
+    read_unlock(&interfaces_lock);
+    return rv;
+}
+
+void ipmi_get_version(ipmi_user_t   user,
+                      unsigned char *major,
+                      unsigned char *minor)
+{
+    *major = user->intf->version_major;
+    *minor = user->intf->version_minor;
+}
+
+void ipmi_set_my_address(ipmi_user_t   user,
+                         unsigned char address)
+{
+    user->intf->my_address = address;
+}
+
+unsigned char ipmi_get_my_address(ipmi_user_t user)
+{
+    return user->intf->my_address;
+}
+
+void ipmi_set_my_LUN(ipmi_user_t   user,
+                     unsigned char LUN)
+{
+    user->intf->my_lun = LUN & 0x3;
+}
+
+unsigned char ipmi_get_my_LUN(ipmi_user_t user)
+{
+    return user->intf->my_lun;
+}
+
+int ipmi_set_gets_events(ipmi_user_t user, int val)
+{
+    unsigned long        flags;
+    struct list_head     *e, *e2;
+    struct ipmi_recv_msg *msg;
+
+    read_lock(&(user->intf->users_lock));
+    spin_lock_irqsave(&(user->intf->events_lock), flags);
+    user->gets_events = val;
+
+    if (val) {
+        /* Deliver any queued events. */
+        list_for_each_safe(e, e2, &(user->intf->waiting_events)) {
+            msg = list_entry(e, struct ipmi_recv_msg, link);
+            list_del(e);
+            msg->user = user;
+            deliver_response(msg);
+        }
+    }
+
+    spin_unlock_irqrestore(&(user->intf->events_lock), flags);
+    read_unlock(&(user->intf->users_lock));
+
+    return 0;
+}
+
+int ipmi_register_for_cmd(ipmi_user_t   user,
+                          unsigned char netfn,
+                          unsigned char cmd)
+{
+    struct list_head *entry;
+    unsigned long    flags;
+    struct cmd_rcvr  *rcvr;
+    struct cmd_rcvr  *cmp;
+    int              rv = 0;
+
+
+    rcvr = kmalloc(sizeof(*rcvr), GFP_KERNEL);
+    if (! rcvr)
+        return -ENOMEM;
+
+    read_lock(&(user->intf->users_lock));
+    write_lock_irqsave(&(user->intf->cmd_rcvr_lock), flags);
+    if (user->intf->all_cmd_rcvr != NULL) {
+        rv = -EBUSY;
+        goto out_unlock;
+    }
+
+    /* Make sure the command/netfn pair is not already registered.
+       Scan with a separate cursor (cmp) so the freshly allocated
+       rcvr is not overwritten; an entry is a duplicate only if both
+       the netfn and the cmd match. */
+    list_for_each(entry, &(user->intf->cmd_rcvrs)) {
+        cmp = list_entry(entry, struct cmd_rcvr, link);
+        if ((cmp->netfn == netfn) && (cmp->cmd == cmd)) {
+            rv = -EBUSY;
+            break;
+        }
+    }
+
+    if (! 
+int ipmi_unregister_for_cmd(ipmi_user_t   user,
+			    unsigned char netfn,
+			    unsigned char cmd)
+{
+	struct list_head *entry;
+	unsigned long    flags;
+	struct cmd_rcvr  *rcvr;
+	int              rv = -ENOENT;
+
+	read_lock(&(user->intf->users_lock));
+	write_lock_irqsave(&(user->intf->cmd_rcvr_lock), flags);
+	/* Find the command/netfn and remove it. */
+	list_for_each(entry, &(user->intf->cmd_rcvrs)) {
+		rcvr = list_entry(entry, struct cmd_rcvr, link);
+		if ((rcvr->netfn == netfn) && (rcvr->cmd == cmd)) {
+			rv = 0;
+			list_del(entry);
+			kfree(rcvr);
+			break;
+		}
+	}
+	write_unlock_irqrestore(&(user->intf->cmd_rcvr_lock), flags);
+	read_unlock(&(user->intf->users_lock));
+
+	return rv;
+}
+
+static unsigned char
+ipmb_checksum(unsigned char *data, int size)
+{
+	unsigned char csum = 0;
+
+	for (; size > 0; size--, data++)
+		csum += *data;
+
+	return -csum;
+}
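+
+/*
+ * A small illustration, not in the patch, of the two's-complement
+ * checksum above: summing the protected bytes plus the checksum byte
+ * must yield zero mod 256, which is how a receiver verifies a frame.
+ */
+static int example_ipmb_csum_ok(unsigned char *data, int size)
+{
+	unsigned char sum = 0;
+	int           i;
+
+	/* "size" covers the protected bytes plus the checksum byte. */
+	for (i = 0; i < size; i++)
+		sum += data[i];
+
+	return sum == 0;
+}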
+
+/* Separate from ipmi_request so that the user does not have to be
+   supplied in certain circumstances (mainly at panic time).  If
+   messages are supplied, they will be freed, even if an error
+   occurs. */
+static inline int i_ipmi_request(ipmi_user_t          user,
+				 ipmi_smi_t           intf,
+				 struct ipmi_addr     *addr,
+				 long                 msgid,
+				 struct ipmi_msg      *msg,
+				 void                 *supplied_smi,
+				 struct ipmi_recv_msg *supplied_recv,
+				 int                  priority,
+				 unsigned char        source_address,
+				 unsigned char        source_lun)
+{
+	int                  rv = 0;
+	struct ipmi_smi_msg  *smi_msg;
+	struct ipmi_recv_msg *recv_msg;
+
+
+	if (supplied_recv) {
+		recv_msg = supplied_recv;
+	} else {
+		recv_msg = ipmi_alloc_recv_msg();
+		if (recv_msg == NULL) {
+			return -ENOMEM;
+		}
+	}
+
+	if (supplied_smi) {
+		smi_msg = (struct ipmi_smi_msg *) supplied_smi;
+	} else {
+		smi_msg = ipmi_alloc_smi_msg();
+		if (smi_msg == NULL) {
+			ipmi_free_recv_msg(recv_msg);
+			return -ENOMEM;
+		}
+	}
+
+	/* Valid channels are 0 to IPMI_NUM_CHANNELS-1. */
+	if (addr->channel >= IPMI_NUM_CHANNELS) {
+		rv = -EINVAL;
+		goto out_err;
+	}
+
+	if (addr->addr_type == IPMI_SYSTEM_INTERFACE_ADDR_TYPE) {
+		struct ipmi_system_interface_addr *smi_addr;
+
+		smi_addr = (struct ipmi_system_interface_addr *) addr;
+		if (smi_addr->lun > 3) {
+			/* Don't leak the messages on a bad LUN. */
+			rv = -EINVAL;
+			goto out_err;
+		}
+
+		if ((msg->netfn == IPMI_NETFN_APP_REQUEST)
+		    && ((msg->cmd == IPMI_SEND_MSG_CMD)
+			|| (msg->cmd == IPMI_GET_MSG_CMD)
+			|| (msg->cmd == IPMI_READ_EVENT_MSG_BUFFER_CMD)))
+		{
+			/* We don't let the user do these, since we manage
+			   the sequence numbers. */
+			rv = -EINVAL;
+			goto out_err;
+		}
+
+		if ((msg->data_len + 2) > IPMI_MAX_MSG_LENGTH) {
+			rv = -EMSGSIZE;
+			goto out_err;
+		}
+
+		recv_msg->user = user;
+		recv_msg->addr = *addr;
+		recv_msg->msgid = msgid;
+		recv_msg->msg = *msg;
+
+		smi_msg->data[0] = (msg->netfn << 2) | (smi_addr->lun & 0x3);
+		smi_msg->data[1] = msg->cmd;
+		smi_msg->msgid = msgid;
+		smi_msg->user_data = recv_msg;
+		if (msg->data_len > 0)
+			memcpy(&(smi_msg->data[2]), msg->data, msg->data_len);
+		smi_msg->data_size = msg->data_len + 2;
+	} else if ((addr->addr_type == IPMI_IPMB_ADDR_TYPE)
+		   || (addr->addr_type == IPMI_IPMB_BROADCAST_ADDR_TYPE))
+	{
+		struct ipmi_ipmb_addr *ipmb_addr;
+		unsigned char         ipmb_seq;
+		int                   i;
+
+		if (addr == NULL) {
+			rv = -EINVAL;
+			goto out_err;
+		}
+
+		if (addr->addr_type == IPMI_IPMB_BROADCAST_ADDR_TYPE) {
+			/* Broadcasts add a zero at the beginning of the
+			   message, but are otherwise the same as an IPMB
+			   address. */
+			smi_msg->data[3] = 0;
+			addr->addr_type = IPMI_IPMB_ADDR_TYPE;
+			i = 1;
+		} else {
+			i = 0;
+		}
+
+		/* 9 for the header and 1 for the checksum, plus
+		   possibly one for the broadcast. */
+		if ((msg->data_len + 10 + i) > IPMI_MAX_MSG_LENGTH) {
+			rv = -EMSGSIZE;
+			goto out_err;
+		}
+
+		ipmb_addr = (struct ipmi_ipmb_addr *) addr;
+		if (ipmb_addr->lun > 3) {
+			/* Don't leak the messages on a bad LUN. */
+			rv = -EINVAL;
+			goto out_err;
+		}
+
+		memcpy(&(recv_msg->addr), ipmb_addr, sizeof(*ipmb_addr));
+
+		recv_msg->user = user;
+		recv_msg->msgid = msgid;
+		recv_msg->msg = *msg;
+
+		if (recv_msg->msg.netfn & 0x1) {
+			/* It's a response, so use the user's sequence. */
+			ipmb_seq = msgid;
+		} else {
+			/* It's a command, so get a sequence for it. */
+			/* Create a sequence number with a 5 second timeout. */
+			/* FIXME - magic number for the timeout. */
+			rv = intf_next_seq(intf,
+					   recv_msg,
+					   5000,
+					   &ipmb_seq);
+			if (rv) {
+				/* We have used up all the sequence numbers,
+				   probably, so abort.  The messages are
+				   freed at out_err; don't free them here
+				   too. */
+				goto out_err;
+			}
+		}
+
+		/* Format the IPMB header data. */
+		smi_msg->data[0] = (IPMI_NETFN_APP_REQUEST << 2);
+		smi_msg->data[1] = IPMI_SEND_MSG_CMD;
+		smi_msg->data[2] = addr->channel;
+		smi_msg->data[i+3] = ipmb_addr->slave_addr;
+		smi_msg->data[i+4] = (msg->netfn << 2) | (ipmb_addr->lun & 0x3);
+		smi_msg->data[i+5] = ipmb_checksum(&(smi_msg->data[i+3]), 2);
+		smi_msg->data[i+6] = source_address;
+		smi_msg->data[i+7] = (ipmb_seq << 2) | source_lun;
+		smi_msg->data[i+8] = msg->cmd;
+
+		/* Now tack on the data to the message. */
+		if (msg->data_len > 0)
+			memcpy(&(smi_msg->data[i+9]), msg->data,
+			       msg->data_len);
+		smi_msg->data_size = msg->data_len + 9;
+
+		/* Now calculate the checksum and tack it on. */
+		smi_msg->data[i+smi_msg->data_size]
+			= ipmb_checksum(&(smi_msg->data[i+6]),
+					smi_msg->data_size-6);
+
+		/* Add on the checksum size and the offset from the
+		   broadcast. */
+		smi_msg->data_size += 1 + i;
+
+		smi_msg->msgid = msgid;
+	} else {
+		/* Unknown address type.
*/ + rv = -EINVAL; + goto out_err; + } + + intf->handlers->sender(intf->send_info, smi_msg, priority); + + return 0; + + out_err: + smi_msg->done(smi_msg); + recv_msg->done(recv_msg); + return rv; +} + +int ipmi_request(ipmi_user_t user, + struct ipmi_addr *addr, + long msgid, + struct ipmi_msg *msg, + int priority) +{ + return i_ipmi_request(user, + user->intf, + addr, + msgid, + msg, + NULL, NULL, + priority, + user->intf->my_address, + user->intf->my_lun); +} + +int ipmi_request_supply_msgs(ipmi_user_t user, + struct ipmi_addr *addr, + long msgid, + struct ipmi_msg *msg, + void *supplied_smi, + struct ipmi_recv_msg *supplied_recv, + int priority) +{ + return i_ipmi_request(user, + user->intf, + addr, + msgid, + msg, + supplied_smi, + supplied_recv, + priority, + user->intf->my_address, + user->intf->my_lun); +} + +int ipmi_request_with_source(ipmi_user_t user, + struct ipmi_addr *addr, + long msgid, + struct ipmi_msg *msg, + int priority, + unsigned char source_address, + unsigned char source_lun) +{ + return i_ipmi_request(user, + user->intf, + addr, + msgid, + msg, + NULL, NULL, + priority, + source_address, + source_lun); +} + +int ipmi_register_smi(struct ipmi_smi_handlers *handlers, + void *send_info, + unsigned char version_major, + unsigned char version_minor, + ipmi_smi_t *intf) +{ + int i, j; + int rv; + unsigned long flags; + ipmi_smi_t new_intf; + struct list_head *entry; + + + /* Make sure the driver is actually initialized, this handles + problems with initialization order. */ + if (!initialized) { + rv = ipmi_init_msghandler(); + if (rv) + return rv; + /* The init code doesn't return an error if it was turned + off, but it won't initialize. Check that. */ + if (!initialized) + return -ENODEV; + } + + new_intf = kmalloc(sizeof(*new_intf), GFP_KERNEL); + if (!new_intf) + return -ENOMEM; + + rv = -ENOMEM; + + spin_lock_irqsave(&interfaces_outside_lock, flags); + write_lock(&interfaces_lock); + for (i=0; i<MAX_IPMI_INTERFACES; i++) { + if (ipmi_interfaces[i] == NULL) { + new_intf->version_major = version_major; + new_intf->version_minor = version_minor; + new_intf->my_address = IPMI_BMC_SLAVE_ADDR; + new_intf->my_lun = 2; /* the SMS LUN. */ + rwlock_init(&(new_intf->users_lock)); + INIT_LIST_HEAD(&(new_intf->users)); + new_intf->handlers = handlers; + new_intf->send_info = send_info; + spin_lock_init(&(new_intf->seq_lock)); + for (j=0; j<IPMI_IPMB_NUM_SEQ; j++) + new_intf->seq_table[j].inuse = 0; + new_intf->curr_seq = 0; + spin_lock_init(&(new_intf->waiting_msgs_lock)); + INIT_LIST_HEAD(&(new_intf->waiting_msgs)); + spin_lock_init(&(new_intf->events_lock)); + INIT_LIST_HEAD(&(new_intf->waiting_events)); + new_intf->waiting_events_count = 0; + rwlock_init(&(new_intf->cmd_rcvr_lock)); + INIT_LIST_HEAD(&(new_intf->cmd_rcvrs)); + new_intf->all_cmd_rcvr = NULL; + MOD_INC_USE_COUNT; + + ipmi_interfaces[i] = new_intf; + + rv = 0; + *intf = new_intf; + break; + } + } + + /* This unusual lock combination allows us to convert the + interfaces lock to a read lock atomically. This way, we + can call the callbacks with the new interface without + having to worry about the interface going away, but still + letting them register and unregister users. */ + write_unlock(&interfaces_lock); + read_lock(&interfaces_lock); + spin_unlock_irqrestore(&interfaces_outside_lock, flags); + + if (rv == 0) { + /* Call all the watcher interfaces to tell them that a + new interface is available. 
*/
+		read_lock(&smi_watcher_lock);
+		list_for_each(entry, &smi_watchers) {
+			struct ipmi_smi_watcher *w;
+			w = list_entry(entry, struct ipmi_smi_watcher, link);
+			w->new_smi(i);
+		}
+		read_unlock(&smi_watcher_lock);
+	}
+
+	read_unlock(&interfaces_lock);
+
+	if (rv)
+		kfree(new_intf);
+
+	return rv;
+}
+
+static void free_recv_msg_list(struct list_head *q)
+{
+	struct list_head     *entry, *entry2;
+	struct ipmi_recv_msg *msg;
+
+	list_for_each_safe(entry, entry2, q) {
+		msg = list_entry(entry, struct ipmi_recv_msg, link);
+		list_del(entry);
+		ipmi_free_recv_msg(msg);
+	}
+}
+
+static void free_cmd_rcvr_list(struct list_head *q)
+{
+	struct list_head *entry, *entry2;
+	struct cmd_rcvr  *rcvr;
+
+	list_for_each_safe(entry, entry2, q) {
+		rcvr = list_entry(entry, struct cmd_rcvr, link);
+		list_del(entry);
+		kfree(rcvr);
+	}
+}
+
+static void clean_up_interface_data(ipmi_smi_t intf)
+{
+	int i;
+
+	free_recv_msg_list(&(intf->waiting_msgs));
+	free_recv_msg_list(&(intf->waiting_events));
+	free_cmd_rcvr_list(&(intf->cmd_rcvrs));
+
+	for (i=0; i<IPMI_IPMB_NUM_SEQ; i++) {
+		if ((intf->seq_table[i].inuse)
+		    && (intf->seq_table[i].recv_msg))
+		{
+			ipmi_free_recv_msg(intf->seq_table[i].recv_msg);
+		}
+	}
+}
+
+int ipmi_unregister_smi(ipmi_smi_t intf)
+{
+	int              rv = -ENODEV;
+	unsigned long    flags;
+	int              i;
+	struct list_head *entry;
+
+	spin_lock_irqsave(&interfaces_outside_lock, flags);
+	write_lock(&interfaces_lock);
+
+	write_lock(&(intf->users_lock));
+	if (list_empty(&(intf->users)))
+	{
+		for (i=0; i<MAX_IPMI_INTERFACES; i++) {
+			if (ipmi_interfaces[i] == intf) {
+				clean_up_interface_data(intf);
+				ipmi_interfaces[i] = NULL;
+				write_unlock(&(intf->users_lock));
+				kfree(intf);
+				MOD_DEC_USE_COUNT;
+				rv = 0;
+				goto out_call_watcher;
+			}
+		}
+	} else {
+		rv = -EBUSY;
+	}
+	write_unlock(&(intf->users_lock));
+	/* Don't leak the interfaces lock on the failure paths. */
+	write_unlock(&interfaces_lock);
+	spin_unlock_irqrestore(&interfaces_outside_lock, flags);
+
+	return rv;
+
+ out_call_watcher:
+	/* This unusual lock combination allows us to convert the
+	   interfaces lock to a read lock atomically.  This way, we
+	   can call the callbacks without having to worry about the
+	   interface list changing underneath us, but still let the
+	   watchers register and unregister users. */
+	write_unlock(&interfaces_lock);
+	read_lock(&interfaces_lock);
+	spin_unlock_irqrestore(&interfaces_outside_lock, flags);
+
+	/* Call all the watcher interfaces to tell them that
+	   an interface is gone. */
+	read_lock(&smi_watcher_lock);
+	list_for_each(entry, &smi_watchers) {
+		struct ipmi_smi_watcher *w;
+		w = list_entry(entry,
+			       struct ipmi_smi_watcher,
+			       link);
+		w->smi_gone(i);
+	}
+	read_unlock(&smi_watcher_lock);
+	read_unlock(&interfaces_lock);
+	return 0;
+}
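+
+/*
+ * For reference while reading the handlers below, a sketch (not in
+ * the patch) of how this code lays out a Get Message response buffer.
+ * The struct is purely illustrative; the driver works on raw offsets.
+ */
+struct example_get_msg_rsp {
+	unsigned char netfn_lun;	/* rsp[0]: netfn << 2 | LUN */
+	unsigned char cmd;		/* rsp[1]: IPMI_GET_MSG_CMD */
+	unsigned char completion;	/* rsp[2]: 0 on success */
+	unsigned char channel;		/* rsp[3] */
+	unsigned char rq_netfn_lun;	/* rsp[4]: embedded netfn/LUN */
+	unsigned char csum1;		/* rsp[5]: header checksum */
+	unsigned char slave_addr;	/* rsp[6]: remote slave address */
+	unsigned char seq_lun;		/* rsp[7]: seq << 2 | LUN */
+	unsigned char embedded_cmd;	/* rsp[8] */
+	unsigned char data[0];		/* rsp[9]...: payload */
+};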
+static int handle_get_msg_rsp(ipmi_smi_t          intf,
+			      struct ipmi_smi_msg *msg)
+{
+	struct ipmi_ipmb_addr ipmb_addr;
+	struct ipmi_recv_msg  *recv_msg;
+
+
+	if (msg->rsp[2] != 0) {
+		/* An error getting the response, just ignore it. */
+		return 0;
+	}
+
+	ipmb_addr.addr_type = IPMI_IPMB_ADDR_TYPE;
+	ipmb_addr.slave_addr = msg->rsp[6];
+	ipmb_addr.lun = msg->rsp[7] & 3;
+
+	/* It's a response from a remote entity.  Look up the sequence
+	   number and handle the response. */
+	if (intf_find_seq(intf,
+			  msg->rsp[7] >> 2,
+			  msg->rsp[3],
+			  msg->rsp[8],
+			  (msg->rsp[4] >> 2) & (~1),
+			  (struct ipmi_addr *) &(ipmb_addr),
+			  &recv_msg))
+	{
+		/* We were unable to find the sequence number,
+		   so just nuke the message. */
+		return 0;
+	}
+
+	memcpy(recv_msg->msg_data,
+	       &(msg->rsp[9]),
+	       msg->rsp_size - 9);
+	/* The other fields matched, so no need to set them, except
+	   for netfn, which needs to be the response that was
+	   returned, not the request value. */
+	recv_msg->msg.netfn = msg->rsp[4] >> 2;
+	recv_msg->msg.data = recv_msg->msg_data;
+	recv_msg->msg.data_len = msg->rsp_size - 9;
+	recv_msg->recv_type = IPMI_RESPONSE_RECV_TYPE;
+	deliver_response(recv_msg);
+
+	return 0;
+}
+
+static int handle_get_msg_cmd(ipmi_smi_t          intf,
+			      struct ipmi_smi_msg *msg)
+{
+	struct list_head      *entry;
+	struct cmd_rcvr       *rcvr;
+	int                   rv = 0;
+	unsigned char         netfn;
+	unsigned char         cmd;
+	ipmi_user_t           user = NULL;
+	struct ipmi_ipmb_addr *ipmb_addr;
+	struct ipmi_recv_msg  *recv_msg;
+
+	if (msg->rsp[2] != 0) {
+		/* An error getting the response, just ignore it. */
+		return 0;
+	}
+
+	netfn = msg->rsp[4] >> 2;
+	cmd = msg->rsp[8];
+
+	read_lock(&(intf->cmd_rcvr_lock));
+
+	if (intf->all_cmd_rcvr) {
+		user = intf->all_cmd_rcvr;
+	} else {
+		/* Find the command/netfn.  Both must match. */
+		list_for_each(entry, &(intf->cmd_rcvrs)) {
+			rcvr = list_entry(entry, struct cmd_rcvr, link);
+			if ((rcvr->netfn == netfn) && (rcvr->cmd == cmd)) {
+				user = rcvr->user;
+				break;
+			}
+		}
+	}
+	read_unlock(&(intf->cmd_rcvr_lock));
+
+	if (user == NULL) {
+		/* We didn't find a user, deliver an error response. */
+		msg->data[0] = (IPMI_NETFN_APP_REQUEST << 2);
+		msg->data[1] = IPMI_SEND_MSG_CMD;
+		msg->data[2] = msg->rsp[3];
+		msg->data[3] = msg->rsp[6];
+		/* The response netfn, with the requester's LUN. */
+		msg->data[4] = ((netfn | 1) << 2) | (msg->rsp[7] & 0x3);
+		msg->data[5] = ipmb_checksum(&(msg->data[3]), 2);
+		msg->data[6] = intf->my_address;
+		msg->data[7] = msg->rsp[7]; /* rqseq/lun */
+		msg->data[8] = msg->rsp[8]; /* cmd */
+		msg->data[9] = IPMI_INVALID_CMD_COMPLETION_CODE;
+		msg->data[10] = ipmb_checksum(&(msg->data[6]), 4);
+		msg->data_size = 11;
+
+		intf->handlers->sender(intf->send_info, msg, 0);
+
+		rv = -1; /* We used the message, so return the value that
+			    causes it to not be freed or queued. */
+	} else {
+		/* Deliver the message to the user. */
+		recv_msg = ipmi_alloc_recv_msg();
+		if (! recv_msg) {
+			/* We couldn't allocate memory for the
+			   message, so requeue it for handling
+			   later. */
+			rv = 1;
+		} else {
+			ipmb_addr = (struct ipmi_ipmb_addr *) &recv_msg->addr;
+			ipmb_addr->addr_type = IPMI_IPMB_ADDR_TYPE;
+			ipmb_addr->slave_addr = msg->rsp[6];
+			ipmb_addr->lun = msg->rsp[7] & 3;
+			ipmb_addr->channel = msg->rsp[3];
+
+			recv_msg->user = user;
+			recv_msg->recv_type = IPMI_CMD_RECV_TYPE;
+			recv_msg->msgid = msg->rsp[7] >> 2;
+			recv_msg->msg.netfn = msg->rsp[4] >> 2;
+			recv_msg->msg.cmd = msg->rsp[8];
+			recv_msg->msg.data = recv_msg->msg_data;
+			recv_msg->msg.data_len = msg->rsp_size - 9;
+			memcpy(recv_msg->msg_data,
+			       &(msg->rsp[9]),
+			       msg->rsp_size - 9);
+			deliver_response(recv_msg);
+		}
+	}
+
+	return rv;
+}
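+
+/*
+ * An illustrative sketch, not in the patch: how a user's receive
+ * handler might answer a command delivered with IPMI_CMD_RECV_TYPE.
+ * Per the interface rules, the received msgid carries the IPMB
+ * sequence number and must be passed back as the msgid of the
+ * response.  The function name is made up.
+ */
+static void example_answer_cmd(ipmi_user_t user, struct ipmi_recv_msg *in)
+{
+	struct ipmi_msg rsp;
+	unsigned char   data[1];
+
+	data[0] = 0;			/* completion code: OK */
+	rsp.netfn = in->msg.netfn | 1;	/* response netfn */
+	rsp.cmd = in->msg.cmd;
+	rsp.data = data;
+	rsp.data_len = 1;
+
+	/* Same address it came from; msgid carries the sequence. */
+	ipmi_request(user, &in->addr, in->msgid, &rsp, 0);
+	ipmi_free_recv_msg(in);
+}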
+static void copy_event_into_recv_msg(struct ipmi_recv_msg *recv_msg,
+				     struct ipmi_smi_msg  *msg)
+{
+	struct ipmi_system_interface_addr *smi_addr;
+
+	recv_msg->msgid = 0;
+	smi_addr = (struct ipmi_system_interface_addr *) &(recv_msg->addr);
+	smi_addr->addr_type = IPMI_SYSTEM_INTERFACE_ADDR_TYPE;
+	smi_addr->channel = IPMI_BMC_CHANNEL;
+	smi_addr->lun = msg->rsp[0] & 3;
+	recv_msg->recv_type = IPMI_ASYNC_EVENT_RECV_TYPE;
+	recv_msg->msg.netfn = msg->rsp[0] >> 2;
+	recv_msg->msg.cmd = msg->rsp[1];
+	memcpy(recv_msg->msg_data, &(msg->rsp[3]), msg->rsp_size - 3);
+	recv_msg->msg.data = recv_msg->msg_data;
+	recv_msg->msg.data_len = msg->rsp_size - 3;
+}
+
+/* This will be called with the intf->users_lock read-locked, so no need
+   to do that here. */
+static int handle_read_event_rsp(ipmi_smi_t          intf,
+				 struct ipmi_smi_msg *msg)
+{
+	struct ipmi_recv_msg *recv_msg;
+	struct list_head     msgs;
+	struct list_head     *entry, *entry2;
+	ipmi_user_t          user;
+	int                  rv = 0;
+	int                  deliver_count = 0;
+	unsigned long        flags;
+
+	if (msg->rsp_size < 19) {
+		/* Message is too small to be an IPMB event. */
+		return 0;
+	}
+
+	if (msg->rsp[2] != 0) {
+		/* An error getting the event, just ignore it. */
+		return 0;
+	}
+
+	INIT_LIST_HEAD(&msgs);
+
+	spin_lock_irqsave(&(intf->events_lock), flags);
+
+	/* Allocate and fill in one message for every user that is getting
+	   events. */
+	list_for_each(entry, &(intf->users)) {
+		user = list_entry(entry, struct ipmi_user, link);
+
+		if (! user->gets_events)
+			continue;
+
+		recv_msg = ipmi_alloc_recv_msg();
+		if (! recv_msg) {
+			list_for_each_safe(entry, entry2, &msgs) {
+				recv_msg = list_entry(entry,
+						      struct ipmi_recv_msg,
+						      link);
+				list_del(entry);
+				ipmi_free_recv_msg(recv_msg);
+			}
+			/* We couldn't allocate memory for the
+			   message, so requeue it for handling
+			   later. */
+			rv = 1;
+			goto out;
+		}
+
+		deliver_count++;
+
+		copy_event_into_recv_msg(recv_msg, msg);
+		recv_msg->user = user;
+		list_add_tail(&(recv_msg->link), &msgs);
+	}
+
+	if (deliver_count) {
+		/* Now deliver all the messages. */
+		list_for_each_safe(entry, entry2, &msgs) {
+			recv_msg = list_entry(entry,
+					      struct ipmi_recv_msg,
+					      link);
+			list_del(entry);
+			deliver_response(recv_msg);
+		}
+	} else if (intf->waiting_events_count < MAX_EVENTS_IN_QUEUE) {
+		/* No one to receive the message, put it in queue if there's
+		   not already too many things in the queue. */
+		recv_msg = ipmi_alloc_recv_msg();
+		if (! recv_msg) {
+			/* We couldn't allocate memory for the
+			   message, so requeue it for handling
+			   later. */
+			rv = 1;
+			goto out;
+		}
+
+		copy_event_into_recv_msg(recv_msg, msg);
+		list_add_tail(&(recv_msg->link), &(intf->waiting_events));
+		/* Keep the count in step, or the limit check above
+		   can never trigger. */
+		intf->waiting_events_count++;
+	} else {
+		/* There are too many things in the queue, discard this
+		   message. */
+		printk(KERN_WARNING "ipmi: Event queue full, discarding an"
+		       " incoming event\n");
+	}
+
+ out:
+	spin_unlock_irqrestore(&(intf->events_lock), flags);
+
+	return rv;
+}
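+
+/*
+ * A short sketch, not in the patch: a kernel user opting in to the
+ * event flow above.  Events arrive in the normal receive handler with
+ * recv_type == IPMI_ASYNC_EVENT_RECV_TYPE; the names are made up.
+ */
+static void example_event_hndl(struct ipmi_recv_msg *msg, void *handler_data)
+{
+	if (msg->recv_type == IPMI_ASYNC_EVENT_RECV_TYPE)
+		printk(KERN_DEBUG "example: event, %d byte(s)\n",
+		       msg->msg.data_len);
+	ipmi_free_recv_msg(msg);
+}
+
+static int example_enable_events(ipmi_user_t user)
+{
+	/* Any queued events are delivered to us immediately. */
+	return ipmi_set_gets_events(user, 1);
+}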
+static int handle_bmc_rsp(ipmi_smi_t          intf,
+			  struct ipmi_smi_msg *msg)
+{
+	struct ipmi_recv_msg *recv_msg;
+	int                  found = 0;
+	struct list_head     *entry;
+
+	recv_msg = (struct ipmi_recv_msg *) msg->user_data;
+
+	/* Make sure the user still exists. */
+	list_for_each(entry, &(intf->users)) {
+		if (list_entry(entry, struct ipmi_user, link)
+		    == recv_msg->user)
+		{
+			/* Found it, so we can deliver it */
+			found = 1;
+			break;
+		}
+	}
+
+	if (!found) {
+		/* The user for the message went away, so give up. */
+		ipmi_free_recv_msg(recv_msg);
+	} else {
+		struct ipmi_system_interface_addr *smi_addr;
+
+		recv_msg->recv_type = IPMI_RESPONSE_RECV_TYPE;
+		recv_msg->msgid = msg->msgid;
+		smi_addr = ((struct ipmi_system_interface_addr *)
+			    &(recv_msg->addr));
+		smi_addr->addr_type = IPMI_SYSTEM_INTERFACE_ADDR_TYPE;
+		smi_addr->channel = IPMI_BMC_CHANNEL;
+		smi_addr->lun = msg->rsp[0] & 3;
+		recv_msg->msg.netfn = msg->rsp[0] >> 2;
+		recv_msg->msg.cmd = msg->rsp[1];
+		memcpy(recv_msg->msg_data,
+		       &(msg->rsp[2]),
+		       msg->rsp_size - 2);
+		recv_msg->msg.data = recv_msg->msg_data;
+		recv_msg->msg.data_len = msg->rsp_size - 2;
+		deliver_response(recv_msg);
+	}
+
+	return 0;
+}
+
+/* Handle a new message.  Return 1 if the message should be requeued,
+   0 if the message should be freed, or -1 if the message should not
+   be freed or requeued. */
+static int handle_new_recv_msg(ipmi_smi_t          intf,
+			       struct ipmi_smi_msg *msg)
+{
+	int requeue;
+
+	if (msg->rsp_size < 2) {
+		/* Message is too small to be correct. */
+		requeue = 0;
+	} else if (msg->rsp[1] == IPMI_GET_MSG_CMD) {
+		/* It's from the receive queue. */
+		if (msg->rsp_size < 11) {
+			/* It's too small to be valid, just ignore it. */
+			requeue = 0;
+		} else if (msg->rsp[4] & 0x04) {
+			/* It's a response, so find the requesting message
+			   and send it up. */
+			requeue = handle_get_msg_rsp(intf, msg);
+		} else {
+			/* It's a command to the SMS from some other
+			   entity.  Handle that. */
+			requeue = handle_get_msg_cmd(intf, msg);
+		}
+	} else if (msg->rsp[1] == IPMI_READ_EVENT_MSG_BUFFER_CMD) {
+		/* It's an asynchronous event. */
+		requeue = handle_read_event_rsp(intf, msg);
+	} else {
+		/* It's a response from the local BMC. */
+		requeue = handle_bmc_rsp(intf, msg);
+	}
+
+	return requeue;
+}
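+
+/*
+ * Illustrative only, not in the patch: what a user's handler sees
+ * when a request times out -- a synthesized response whose first data
+ * byte is the timeout completion code, built by handle_msg_timeout()
+ * below.
+ */
+static void example_check_rsp(struct ipmi_recv_msg *msg)
+{
+	if ((msg->recv_type == IPMI_RESPONSE_RECV_TYPE)
+	    && (msg->msg.data[0] == IPMI_TIMEOUT_COMPLETION_CODE))
+		printk(KERN_DEBUG "example: request timed out\n");
+	ipmi_free_recv_msg(msg);
+}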
+/* Handle a new message from the lower layer. */
+void ipmi_smi_msg_received(ipmi_smi_t          intf,
+			   struct ipmi_smi_msg *msg)
+{
+	unsigned long flags;
+	int           rv;
+
+
+	if ((msg->data_size >= 2) && (msg->data[1] == IPMI_SEND_MSG_CMD)) {
+		/* This is the local response to a send, we just
+		   ignore these. */
+		msg->done(msg);
+		return;
+	}
+
+	/* Lock the user lock so the user can't go away while we are
+	   working on it. */
+	read_lock(&(intf->users_lock));
+
+	/* To preserve message order, if the list is not empty, we
+	   tack this message onto the end of the list. */
+	spin_lock_irqsave(&(intf->waiting_msgs_lock), flags);
+	if (!list_empty(&(intf->waiting_msgs))) {
+		list_add_tail(&(msg->link), &(intf->waiting_msgs));
+		/* Don't return with interrupts off or the users
+		   lock held. */
+		spin_unlock_irqrestore(&(intf->waiting_msgs_lock), flags);
+		read_unlock(&(intf->users_lock));
+		return;
+	}
+	spin_unlock_irqrestore(&(intf->waiting_msgs_lock), flags);
+
+	rv = handle_new_recv_msg(intf, msg);
+	if (rv > 0) {
+		/* Could not handle the message now, just add it to a
+		   list to handle later. */
+		spin_lock_irqsave(&(intf->waiting_msgs_lock), flags);
+		list_add_tail(&(msg->link), &(intf->waiting_msgs));
+		spin_unlock_irqrestore(&(intf->waiting_msgs_lock), flags);
+	} else if (rv == 0) {
+		msg->done(msg);
+	}
+
+	read_unlock(&(intf->users_lock));
+}
+
+void ipmi_smi_watchdog_pretimeout(ipmi_smi_t intf)
+{
+	struct list_head *entry;
+	ipmi_user_t      user;
+
+	read_lock(&(intf->users_lock));
+	list_for_each(entry, &(intf->users)) {
+		user = list_entry(entry, struct ipmi_user, link);
+
+		if (! user->handler->ipmi_watchdog_pretimeout)
+			continue;
+
+		user->handler->ipmi_watchdog_pretimeout(user->handler_data);
+	}
+	read_unlock(&(intf->users_lock));
+}
+
+static void
+handle_msg_timeout(struct ipmi_recv_msg *msg)
+{
+	msg->recv_type = IPMI_RESPONSE_RECV_TYPE;
+	msg->msg_data[0] = IPMI_TIMEOUT_COMPLETION_CODE;
+	msg->msg.netfn |= 1; /* Convert to a response. */
+	msg->msg.data_len = 1;
+	msg->msg.data = msg->msg_data;
+	deliver_response(msg);
+}
+
+static void
+ipmi_timeout_handler(long timeout_period)
+{
+	ipmi_smi_t           intf;
+	struct list_head     timeouts;
+	struct ipmi_recv_msg *msg;
+	struct ipmi_smi_msg  *smi_msg;
+	unsigned long        flags;
+	struct list_head     *entry, *entry2;
+	int                  i, j;
+
+	INIT_LIST_HEAD(&timeouts);
+
+	read_lock(&interfaces_lock);
+	for (i=0; i<MAX_IPMI_INTERFACES; i++) {
+		intf = ipmi_interfaces[i];
+		if (intf == NULL)
+			continue;
+
+		read_lock(&(intf->users_lock));
+
+		/* See if any waiting messages need to be processed. */
+		spin_lock_irqsave(&(intf->waiting_msgs_lock), flags);
+		list_for_each_safe(entry, entry2, &(intf->waiting_msgs)) {
+			smi_msg = list_entry(entry, struct ipmi_smi_msg,
+					     link);
+			if (! handle_new_recv_msg(intf, smi_msg)) {
+				list_del(entry);
+				smi_msg->done(smi_msg);
+			} else {
+				/* To preserve message order, quit if we
+				   can't handle a message. */
+				break;
+			}
+		}
+		spin_unlock_irqrestore(&(intf->waiting_msgs_lock), flags);
+
+		/* Go through the seq table and find any messages that
+		   have timed out, putting them in the timeouts
+		   list. */
+		spin_lock_irqsave(&(intf->seq_lock), flags);
+		for (j=0; j<IPMI_IPMB_NUM_SEQ; j++) {
+			if (intf->seq_table[j].inuse) {
+				intf->seq_table[j].timeout -= timeout_period;
+				if (intf->seq_table[j].timeout <= 0) {
+					intf->seq_table[j].inuse = 0;
+					msg = intf->seq_table[j].recv_msg;
+					list_add_tail(&(msg->link),
+						      &timeouts);
+				}
+			}
+		}
+		spin_unlock_irqrestore(&(intf->seq_lock), flags);
+
+		list_for_each_safe(entry, entry2, &timeouts) {
+			msg = list_entry(entry, struct ipmi_recv_msg, link);
+			/* Take it off the list; delivery may free it. */
+			list_del(entry);
+			handle_msg_timeout(msg);
+		}
+
+		read_unlock(&(intf->users_lock));
+	}
+
+	read_unlock(&interfaces_lock);
+}
+
+static void ipmi_request_event(void)
+{
+	ipmi_smi_t intf;
+	int        i;
+
+	read_lock(&interfaces_lock);
+
+	for (i=0; i<MAX_IPMI_INTERFACES; i++) {
+		intf = ipmi_interfaces[i];
+		if (intf == NULL)
+			continue;
+
+		intf->handlers->request_events(intf->send_info);
+	}
+
+	read_unlock(&interfaces_lock);
+}
+
+static struct timer_list ipmi_timer;
+
+/* Call every 100 ms. */
+#define IPMI_TIMEOUT_TIME	100
+#define IPMI_TIMEOUT_JIFFIES	(IPMI_TIMEOUT_TIME/(1000/HZ))
+
+/* Request events from the queue every second.  Hopefully, in the
+   future, IPMI will add a way to know immediately if an event is
+   in the queue.
*/ +#define IPMI_REQUEST_EV_TIME (1000 / (IPMI_TIMEOUT_TIME)) + +static volatile int stop_operation = 0; +static volatile int timer_stopped = 0; +static unsigned int ticks_to_req_ev = IPMI_REQUEST_EV_TIME; + +static void ipmi_timeout(unsigned long data) +{ + if (stop_operation) { + timer_stopped = 1; + return; + } + + ticks_to_req_ev--; + if (ticks_to_req_ev == 0) { + ipmi_request_event(); + ticks_to_req_ev = IPMI_REQUEST_EV_TIME; + } + + ipmi_timeout_handler(IPMI_TIMEOUT_TIME); + + ipmi_timer.expires += IPMI_TIMEOUT_JIFFIES; + add_timer(&ipmi_timer); +} + + +/* FIXME - convert these to slabs. */ +static void free_smi_msg(struct ipmi_smi_msg *msg) +{ + kfree(msg); +} + +struct ipmi_smi_msg *ipmi_alloc_smi_msg(void) +{ + struct ipmi_smi_msg *rv; + rv = kmalloc(sizeof(struct ipmi_smi_msg), GFP_ATOMIC); + if (rv) + rv->done = free_smi_msg; + return rv; +} + +static void free_recv_msg(struct ipmi_recv_msg *msg) +{ + kfree(msg); +} + +struct ipmi_recv_msg *ipmi_alloc_recv_msg(void) +{ + struct ipmi_recv_msg *rv; + + rv = kmalloc(sizeof(struct ipmi_recv_msg), GFP_ATOMIC); + if (rv) + rv->done = free_recv_msg; + return rv; +} + +#ifdef CONFIG_IPMI_PANIC_EVENT + +static void dummy_smi_done_handler(struct ipmi_smi_msg *msg) +{ +} + +static void dummy_recv_done_handler(struct ipmi_recv_msg *msg) +{ +} + +static void send_panic_events(void) +{ + struct ipmi_msg msg; + ipmi_smi_t intf; + unsigned char data[8]; + int i; + struct ipmi_system_interface_addr addr; + struct ipmi_smi_msg smi_msg; + struct ipmi_recv_msg recv_msg; + + addr.addr_type = IPMI_SYSTEM_INTERFACE_ADDR_TYPE; + addr.channel = IPMI_BMC_CHANNEL; + + /* Fill in an event telling that we have failed. */ + msg.netfn = 0x04; /* Sensor or Event. */ + msg.cmd = 2; /* Platform event command. */ + msg.data = data; + msg.data_len = 8; + data[0] = 0x21; /* Kernel generator ID, IPMI table 5-4 */ + data[1] = 0x03; /* This is for IPMI 1.0. */ + data[2] = 0x20; /* OS Critical Stop, IPMI table 36-3 */ + data[4] = 0x6f; /* Sensor specific, IPMI table 36-1 */ + data[5] = 0xa1; /* Runtime stop OEM bytes 2 & 3. */ + + /* These used to have the first three bytes of the panic string, + but not only is that not terribly useful, it's not available + any more. */ + data[3] = 0; + data[6] = 0; + data[7] = 0; + + smi_msg.done = dummy_smi_done_handler; + recv_msg.done = dummy_recv_done_handler; + + /* For every registered interface, send the event. */ + for (i=0; i<MAX_IPMI_INTERFACES; i++) { + intf = ipmi_interfaces[i]; + if (intf == NULL) + continue; + + intf->handlers->set_run_to_completion(intf->send_info, 1); + i_ipmi_request(NULL, + intf, + (struct ipmi_addr *) &addr, + 0, + &msg, + &smi_msg, + &recv_msg, + 0, + intf->my_address, + intf->my_lun); + } +} +#endif /* CONFIG_IPMI_PANIC_EVENT */ + +static int has_paniced = 0; + +static int panic_event(struct notifier_block *this, + unsigned long event, + void *ptr) +{ + int i; + ipmi_smi_t intf; + + if (has_paniced) + return NOTIFY_DONE; + has_paniced = 1; + + /* For every registered interface, set it to run to completion. 
*/ + for (i=0; i<MAX_IPMI_INTERFACES; i++) { + intf = ipmi_interfaces[i]; + if (intf == NULL) + continue; + + intf->handlers->set_run_to_completion(intf->send_info, 1); + } + +#ifdef CONFIG_IPMI_PANIC_EVENT + send_panic_events(); +#endif + + return NOTIFY_DONE; +} + +static struct notifier_block panic_block = { + panic_event, + NULL, + 200 /* priority: INT_MAX >= x >= 0 */ +}; + + +static int ipmi_init_msghandler(void) +{ + int i; + + if (initialized) + return 0; + + for (i=0; i<MAX_IPMI_INTERFACES; i++) { + ipmi_interfaces[i] = NULL; + } + + init_timer(&ipmi_timer); + ipmi_timer.data = 0; + ipmi_timer.function = ipmi_timeout; + ipmi_timer.expires = jiffies + IPMI_TIMEOUT_JIFFIES; + add_timer(&ipmi_timer); + + notifier_chain_register(&panic_notifier_list, &panic_block); + + initialized = 1; + + printk(KERN_INFO "ipmi: message handler initialized\n"); + + return 0; +} + +static __exit void cleanup_ipmi(void) +{ + if (!initialized) + return; + + notifier_chain_unregister(&panic_notifier_list, &panic_block); + + /* This can't be called if any interfaces exist, so no worry about + shutting down the interfaces. */ + + /* Tell the timer to stop, then wait for it to stop. This avoids + problems with race conditions removing the timer here. */ + stop_operation = 1; + while (!timer_stopped) { + schedule_timeout(1); + } + + initialized = 0; +} +module_exit(cleanup_ipmi); + +module_init(ipmi_init_msghandler); +MODULE_LICENSE("GPL"); + +EXPORT_SYMBOL(ipmi_alloc_recv_msg); +EXPORT_SYMBOL(ipmi_create_user); +EXPORT_SYMBOL(ipmi_destroy_user); +EXPORT_SYMBOL(ipmi_get_version); +EXPORT_SYMBOL(ipmi_request); +EXPORT_SYMBOL(ipmi_request_supply_msgs); +EXPORT_SYMBOL(ipmi_request_with_source); +EXPORT_SYMBOL(ipmi_register_smi); +EXPORT_SYMBOL(ipmi_unregister_smi); +EXPORT_SYMBOL(ipmi_register_for_cmd); +EXPORT_SYMBOL(ipmi_unregister_for_cmd); +EXPORT_SYMBOL(ipmi_smi_msg_received); +EXPORT_SYMBOL(ipmi_smi_watchdog_pretimeout); +EXPORT_SYMBOL(ipmi_alloc_smi_msg); +EXPORT_SYMBOL(ipmi_register_all_cmd_rcvr); +EXPORT_SYMBOL(ipmi_unregister_all_cmd_rcvr); +EXPORT_SYMBOL(ipmi_addr_length); +EXPORT_SYMBOL(ipmi_validate_addr); +EXPORT_SYMBOL(ipmi_set_gets_events); +EXPORT_SYMBOL(ipmi_addr_equal); +EXPORT_SYMBOL(ipmi_smi_watcher_register); +EXPORT_SYMBOL(ipmi_smi_watcher_unregister); +EXPORT_SYMBOL(ipmi_set_my_address); +EXPORT_SYMBOL(ipmi_get_my_address); +EXPORT_SYMBOL(ipmi_set_my_LUN); +EXPORT_SYMBOL(ipmi_get_my_LUN); diff -urN linux.orig/drivers/char/ipmi/ipmi_watchdog.c linux/drivers/char/ipmi/ipmi_watchdog.c --- linux.orig/drivers/char/ipmi/ipmi_watchdog.c Wed Dec 31 18:00:00 1969 +++ linux/drivers/char/ipmi/ipmi_watchdog.c Mon Oct 28 16:34:03 2002 @@ -0,0 +1,863 @@ +/* + * ipmi_watchdog.c + * + * A watchdog timer based upon the IPMI interface. + * + * Author: MontaVista Software, Inc. + * Corey Minyard <minyard@mvista.com> + * source@mvista.com + * + * Copyright 2002 MontaVista Software Inc. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or (at your + * option) any later version. + * + * + * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED + * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
+ * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, + * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS + * OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND + * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR + * TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE + * USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + * + * You should have received a copy of the GNU General Public License along + * with this program; if not, write to the Free Software Foundation, Inc., + * 675 Mass Ave, Cambridge, MA 02139, USA. + */ + +#include <linux/config.h> +#include <linux/module.h> +#include <linux/ipmi.h> +#include <linux/ipmi_smi.h> +#include <linux/watchdog.h> +#include <linux/miscdevice.h> +#include <linux/init.h> +#include <linux/spinlock.h> +#include <linux/errno.h> +#include <asm/uaccess.h> +#include <linux/notifier.h> +#include <linux/nmi.h> +#include <linux/reboot.h> + +/* + * The IPMI command/response information for the watchdog timer. + */ + +/* values for byte 1 of the set command, byte 2 of the get response. */ +#define WDOG_DONT_LOG (1 << 7) +#define WDOG_DONT_STOP_ON_SET (1 << 6) +#define WDOG_SET_TIMER_USE(byte, use) \ + byte = ((byte) & 0xf8) | ((use) & 0x7) +#define WDOG_GET_TIMER_USE(byte) ((byte) & 0x7) +#define WDOG_TIMER_USE_BIOS_FRB2 1 +#define WDOG_TIMER_USE_BIOS_POST 2 +#define WDOG_TIMER_USE_OS_LOAD 3 +#define WDOG_TIMER_USE_SMS_OS 4 +#define WDOG_TIMER_USE_OEM 5 + +/* values for byte 2 of the set command, byte 3 of the get response. */ +#define WDOG_SET_PRETIMEOUT_ACT(byte, use) \ + byte = ((byte) & 0x8f) | (((use) & 0x7) << 4) +#define WDOG_GET_PRETIMEOUT_ACT(byte) (((byte) >> 4) & 0x7) +#define WDOG_PRETIMEOUT_NONE 0 +#define WDOG_PRETIMEOUT_SMI 1 +#define WDOG_PRETIMEOUT_NMI 2 +#define WDOG_PRETIMEOUT_MSG_INT 3 + +#define WDOG_SET_TIMEOUT_ACT(byte, use) \ + byte = ((byte) & 0xf8) | ((use) & 0x7) +#define WDOG_GET_TIMEOUT_ACT(byte) ((byte) & 0x7) +#define WDOG_TIMEOUT_NONE 0 +#define WDOG_TIMEOUT_RESET 1 +#define WDOG_TIMEOUT_POWER_DOWN 2 +#define WDOG_TIMEOUT_POWER_CYCLE 3 + +/* Byte 3 of the get command, byte 4 of the get response is the + pre-timeout in seconds. */ + +/* Bits for setting byte 4 of the set command, byte 5 of the get response. */ +#define WDOG_EXPIRE_CLEAR_BIOS_FRB2 (1 << 1) +#define WDOG_EXPIRE_CLEAR_BIOS_POST (1 << 2) +#define WDOG_EXPIRE_CLEAR_OS_LOAD (1 << 3) +#define WDOG_EXPIRE_CLEAR_SMS_OS (1 << 4) +#define WDOG_EXPIRE_CLEAR_OEM (1 << 5) + +/* Setting/getting the watchdog timer value. This is for bytes 5 and + 6 (the timeout time) of the set command, and bytes 6 and 7 (the + timeout time) and 8 and 9 (the current countdown value) of the + response. The timeout value is given in seconds (in the command it + is 100ms intervals). */ +#define WDOG_SET_TIMEOUT(byte1, byte2, val) \ + (byte1) = (((val) * 10) & 0xff), (byte2) = (((val) * 10) >> 8) +#define WDOG_GET_TIMEOUT(byte1, byte2) \ + (((byte1) | ((byte2) << 8)) / 10) + +#define IPMI_WDOG_RESET_TIMER 0x22 +#define IPMI_WDOG_SET_TIMER 0x24 +#define IPMI_WDOG_GET_TIMER 0x25 + +/* These are here until the real ones get into the watchdog.h interface. 
*/ +#ifndef WDIOC_GETTIMEOUT +#define WDIOC_GETTIMEOUT _IOW(WATCHDOG_IOCTL_BASE, 20, int) +#endif +#ifndef WDIOC_SET_PRETIMEOUT +#define WDIOC_SET_PRETIMEOUT _IOW(WATCHDOG_IOCTL_BASE, 21, int) +#endif +#ifndef WDIOC_GET_PRETIMEOUT +#define WDIOC_GET_PRETIMEOUT _IOW(WATCHDOG_IOCTL_BASE, 22, int) +#endif + +static ipmi_user_t watchdog_user = NULL; + +/* Default the timeout to 10 seconds. */ +static int timeout = 10; + +/* The pre-timeout is disabled by default. */ +static int pretimeout = 0; + +/* Default action is to reset the board on a timeout. */ +static unsigned char action_val = WDOG_TIMEOUT_RESET; + +static char *action = "reset"; + +static unsigned char preaction_val = WDOG_PRETIMEOUT_NONE; + +static char *preaction = "none"; + +MODULE_PARM(timeout, "i"); +MODULE_PARM(pretimeout, "i"); +MODULE_PARM(action, "s"); +MODULE_PARM(preaction, "s"); + +/* Default state of the timer. */ +static unsigned char ipmi_watchdog_state = WDOG_TIMEOUT_NONE; + +/* If shutting down via IPMI, we ignore the heartbeat. */ +static int ipmi_ignore_heartbeat = 0; + +/* Is someone using the watchdog? Only one user is allowed. */ +static int ipmi_wdog_open = 0; + +/* If true, the driver will start running as soon as it is configured + and ready. */ +static int start_now = 0; + +/* If set to 1, the heartbeat command will set the state to reset and + start the timer. The timer doesn't normally run when the driver is + first opened until the heartbeat is set the first time, this + variable is used to accomplish this. */ +static int ipmi_start_timer_on_heartbeat = 0; + +/* IPMI version of the BMC. */ +static unsigned char ipmi_version_major; +static unsigned char ipmi_version_minor; + + +static int ipmi_heartbeat(void); + + +/* We use a semaphore to make sure that only one thing can send a set + timeout at one time, because we only have one copy of the data. + The semaphore is claimed when the set_timeout is sent and freed + when both messages are free. */ +static atomic_t set_timeout_tofree = ATOMIC_INIT(0); +static DECLARE_MUTEX(set_timeout_lock); +static void set_timeout_free_smi(struct ipmi_smi_msg *msg) +{ + if (atomic_dec_and_test(&set_timeout_tofree)) + up(&set_timeout_lock); +} +static void set_timeout_free_recv(struct ipmi_recv_msg *msg) +{ + if (atomic_dec_and_test(&set_timeout_tofree)) + up(&set_timeout_lock); +} +static struct ipmi_smi_msg set_timeout_smi_msg = +{ + .done = set_timeout_free_smi +}; +static struct ipmi_recv_msg set_timeout_recv_msg = +{ + .done = set_timeout_free_recv +}; + +static int ipmi_set_timeout(void) +{ + struct ipmi_msg msg; + unsigned char data[6]; + int rv; + struct ipmi_system_interface_addr addr; + int send_heartbeat_now = 0; + + + /* We can only send one of these at a time. */ + down(&set_timeout_lock); + + atomic_set(&set_timeout_tofree, 2); + + data[0] = 0; + WDOG_SET_TIMER_USE(data[0], WDOG_TIMER_USE_SMS_OS); + + if ((ipmi_version_major > 1) + || ((ipmi_version_major == 1) && (ipmi_version_minor >= 5))) + { + /* This is an IPMI 1.5-only feature. */ + data[0] |= WDOG_DONT_STOP_ON_SET; + } else if (ipmi_watchdog_state != WDOG_TIMEOUT_NONE) { + /* In ipmi 1.0, setting the timer stops the watchdog, we + need to start it back up again. */ + send_heartbeat_now = 1; + } + + data[1] = 0; + WDOG_SET_TIMEOUT_ACT(data[1], ipmi_watchdog_state); + if (pretimeout > 0) { + WDOG_SET_PRETIMEOUT_ACT(data[1], preaction_val); + data[2] = pretimeout; + } else { + WDOG_SET_PRETIMEOUT_ACT(data[1], WDOG_PRETIMEOUT_NONE); + data[2] = 0; /* No pretimeout. 
*/ + } + data[3] = 0; + WDOG_SET_TIMEOUT(data[4], data[5], timeout); + + addr.addr_type = IPMI_SYSTEM_INTERFACE_ADDR_TYPE; + addr.channel = IPMI_BMC_CHANNEL; + addr.lun = 0; + + msg.netfn = 0x06; + msg.cmd = IPMI_WDOG_SET_TIMER; + msg.data = data; + msg.data_len = sizeof(data); + rv = ipmi_request_supply_msgs(watchdog_user, + (struct ipmi_addr *) &addr, + 0, + &msg, + &set_timeout_smi_msg, + &set_timeout_recv_msg, + 1); + if (rv) { + up(&set_timeout_lock); + printk(KERN_WARNING "IPMI Watchdog, set timeout error: %d\n", + rv); + } else { + if (send_heartbeat_now) + rv = ipmi_heartbeat(); + } + + return rv; +} + +/* Do a delayed shutdown, with the delay in milliseconds. If power_off is + false, do a reset. If power_off is true, do a power down. This is + primarily for the IMB code's shutdown. */ +void ipmi_delayed_shutdown(long delay, int power_off) +{ + ipmi_ignore_heartbeat = 1; + if (power_off) + ipmi_watchdog_state = WDOG_TIMEOUT_POWER_DOWN; + else + ipmi_watchdog_state = WDOG_TIMEOUT_RESET; + timeout = delay; + ipmi_set_timeout(); +} + +/* We use a semaphore to make sure that only one thing can send a + heartbeat at one time, because we only have one copy of the data. + The semaphore is claimed when the set_timeout is sent and freed + when both messages are free. */ +static atomic_t heartbeat_tofree = ATOMIC_INIT(0); +static DECLARE_MUTEX(heartbeat_lock); +static DECLARE_MUTEX_LOCKED(heartbeat_wait_lock); +static void heartbeat_free_smi(struct ipmi_smi_msg *msg) +{ + if (atomic_dec_and_test(&heartbeat_tofree)) + up(&heartbeat_wait_lock); +} +static void heartbeat_free_recv(struct ipmi_recv_msg *msg) +{ + if (atomic_dec_and_test(&heartbeat_tofree)) + up(&heartbeat_wait_lock); +} +static struct ipmi_smi_msg heartbeat_smi_msg = +{ + .done = heartbeat_free_smi +}; +static struct ipmi_recv_msg heartbeat_recv_msg = +{ + .done = heartbeat_free_recv +}; + +static int ipmi_heartbeat(void) +{ + struct ipmi_msg msg; + int rv; + struct ipmi_system_interface_addr addr; + + if (ipmi_ignore_heartbeat) { + return 0; + } + + if (ipmi_start_timer_on_heartbeat) { + ipmi_start_timer_on_heartbeat = 0; + ipmi_watchdog_state = action_val; + return ipmi_set_timeout(); + } + + down(&heartbeat_lock); + + atomic_set(&heartbeat_tofree, 2); + + /* Don't reset the timer if we have the timer turned off, that + re-enables the watchdog. */ + if (ipmi_watchdog_state == WDOG_TIMEOUT_NONE) { + up(&heartbeat_lock); + return 0; + } + + addr.addr_type = IPMI_SYSTEM_INTERFACE_ADDR_TYPE; + addr.channel = IPMI_BMC_CHANNEL; + addr.lun = 0; + + msg.netfn = 0x06; + msg.cmd = IPMI_WDOG_RESET_TIMER; + msg.data = NULL; + msg.data_len = 0; + rv = ipmi_request_supply_msgs(watchdog_user, + (struct ipmi_addr *) &addr, + 0, + &msg, + &heartbeat_smi_msg, + &heartbeat_recv_msg, + 1); + if (rv) { + up(&heartbeat_lock); + printk(KERN_WARNING "IPMI Watchdog, heartbeat failure: %d\n", + rv); + return rv; + } + + /* Wait for the heartbeat to be sent. */ + down(&heartbeat_wait_lock); + + if (heartbeat_recv_msg.msg.data[0] != 0) { + /* Got an error in the heartbeat response. It was already + reported in ipmi_wdog_msg_handler, but we should return + an error here. */ + rv = -EINVAL; + } + + up(&heartbeat_lock); + + return rv; +} + +static struct watchdog_info ident= +{ + 0, /* WDIOF_SETTIMEOUT, */ + 1, + "IPMI" +}; + +static int ipmi_ioctl(struct inode *inode, struct file *file, + unsigned int cmd, unsigned long arg) +{ + int i; + int val; + + switch(cmd) { + case WDIOC_GETSUPPORT: + i = copy_to_user((void*)arg, &ident, sizeof(ident)); + return i ? 
-EFAULT : 0;
+
+	case WDIOC_SETTIMEOUT:
+		i = copy_from_user(&val, (void *) arg, sizeof(int));
+		if (i)
+			return -EFAULT;
+		timeout = val;
+		return ipmi_set_timeout();
+
+	case WDIOC_GETTIMEOUT:
+		i = copy_to_user((void *) arg,
+				 &timeout,
+				 sizeof(timeout));
+		if (i)
+			return -EFAULT;
+		return 0;
+
+	case WDIOC_SET_PRETIMEOUT:
+		i = copy_from_user(&val, (void *) arg, sizeof(int));
+		if (i)
+			return -EFAULT;
+		pretimeout = val;
+		return ipmi_set_timeout();
+
+	case WDIOC_GET_PRETIMEOUT:
+		i = copy_to_user((void *) arg,
+				 &pretimeout,
+				 sizeof(pretimeout));
+		if (i)
+			return -EFAULT;
+		return 0;
+
+	case WDIOC_KEEPALIVE:
+		return ipmi_heartbeat();
+
+	case WDIOC_SETOPTIONS:
+		i = copy_from_user(&val, (void *) arg, sizeof(int));
+		if (i)
+			return -EFAULT;
+		if (val & WDIOS_DISABLECARD)
+		{
+			ipmi_watchdog_state = WDOG_TIMEOUT_NONE;
+			ipmi_set_timeout();
+			ipmi_start_timer_on_heartbeat = 0;
+		}
+
+		if (val & WDIOS_ENABLECARD)
+		{
+			ipmi_watchdog_state = action_val;
+			ipmi_set_timeout();
+		}
+		return 0;
+
+	case WDIOC_GETSTATUS:
+		val = 0;
+		/* Return 0 or -EFAULT, not the raw copy_to_user count. */
+		i = copy_to_user((void *) arg, &val, sizeof(val));
+		if (i)
+			return -EFAULT;
+		return 0;
+
+	default:
+		return -ENOIOCTLCMD;
+	}
+}
+
+static ssize_t ipmi_write(struct file *file,
+			  const char  *buf,
+			  size_t      len,
+			  loff_t      *ppos)
+{
+	int rv;
+
+	/* Can't seek (pwrite) on this device */
+	if (ppos != &file->f_pos)
+		return -ESPIPE;
+
+	if (len) {
+		rv = ipmi_heartbeat();
+		if (rv)
+			return rv;
+		return 1;
+	}
+	return 0;
+}
+
+static ssize_t ipmi_read(struct file *file,
+			 char        *buf,
+			 size_t      count,
+			 loff_t      *ppos)
+{
+	/* Can't seek (pread) on this device */
+	if (ppos != &file->f_pos)
+		return -ESPIPE;
+
+	/* Also can't read it. */
+	return -EINVAL;
+}
+
+static int ipmi_open(struct inode *ino, struct file *filep)
+{
+	switch (minor(ino->i_rdev))
+	{
+	case WATCHDOG_MINOR:
+		if (ipmi_wdog_open)
+			return -EBUSY;
+
+		MOD_INC_USE_COUNT;
+		ipmi_wdog_open = 1;
+
+		/* Don't start the timer now, let it start on the
+		   first heartbeat.
*/ + ipmi_start_timer_on_heartbeat = 1; + return(0); + + default: + return (-ENODEV); + } +} + +static int ipmi_close(struct inode *ino, struct file *filep) +{ + if (minor(ino->i_rdev)==WATCHDOG_MINOR) + { +#ifndef CONFIG_WATCHDOG_NOWAYOUT + ipmi_watchdog_state = WDOG_TIMEOUT_NONE; + ipmi_set_timeout(); +#endif + ipmi_wdog_open = 0; + MOD_DEC_USE_COUNT; + } + return 0; +} + +static struct file_operations ipmi_wdog_fops = { + .owner = THIS_MODULE, + .read = ipmi_read, + .write = ipmi_write, + .ioctl = ipmi_ioctl, + .open = ipmi_open, + .release = ipmi_close, +}; + +static struct miscdevice ipmi_wdog_miscdev = { + WATCHDOG_MINOR, + "watchdog", + &ipmi_wdog_fops +}; + +static spinlock_t register_lock = SPIN_LOCK_UNLOCKED; + +static void ipmi_wdog_msg_handler(struct ipmi_recv_msg *msg, + void *handler_data) +{ + if (msg->msg.data[0] != 0) { + printk(KERN_ERR "IPMI Watchdog response: Error %x on cmd %x\n", + msg->msg.data[0], + msg->msg.cmd); + } + + ipmi_free_recv_msg(msg); +} + +static void ipmi_wdog_pretimeout_handler(void *handler_data) +{ + panic("Watchdog pre-timeout"); +} + +static struct ipmi_user_hndl ipmi_hndlrs = +{ + .ipmi_recv_hndl = ipmi_wdog_msg_handler, + .ipmi_watchdog_pretimeout = ipmi_wdog_pretimeout_handler +}; + +static void ipmi_register_watchdog(int ipmi_intf) +{ + unsigned long flags; + int rv = -EBUSY; + + spin_lock_irqsave(®ister_lock, flags); + if (watchdog_user) + goto out; + + rv = ipmi_create_user(ipmi_intf, &ipmi_hndlrs, NULL, &watchdog_user); + if (rv < 0) { + printk("IPMI watchdog: Unable to register with ipmi\n"); + goto out; + } + + ipmi_get_version(watchdog_user, + &ipmi_version_major, + &ipmi_version_minor); + + rv = misc_register(&ipmi_wdog_miscdev); + if (rv < 0) { + ipmi_destroy_user(watchdog_user); + watchdog_user = NULL; + printk("IPMI watchdog: Unable to register misc device\n"); + } + + out: + spin_unlock_irqrestore(®ister_lock, flags); + + if ((start_now) && (rv == 0)) { + /* Run from startup, so start the timer now. */ + start_now = 0; /* Disable this function after first startup. */ + ipmi_watchdog_state = action_val; + ipmi_set_timeout(); + printk("Starting IPMI Watchdog now!\n"); + } +} + +#ifdef HAVE_NMI_HANDLER +static int +ipmi_nmi(void *dev_id, struct pt_regs *regs, int cpu, int handled) +{ + /* If no one else handled the NMI, we assume it was the IPMI + watchdog. */ + if (!handled) + panic("IPMI watchdog pre-timeout"); + return NOTIFY_DONE; +} + +static struct nmi_handler ipmi_nmi_handler = +{ + .link = LIST_HEAD_INIT(ipmi_nmi_handler.link), + .dev_name = "ipmi_watchdog", + .dev_id = NULL, + .handler = ipmi_nmi, + .priority = 0, /* Call us last. */ +}; +#endif + +static int wdog_reboot_handler(struct notifier_block *this, + unsigned long code, + void *unused) +{ + static int reboot_event_handled = 0; + + if ((watchdog_user) && (!reboot_event_handled)) { + /* Make sure we only do this once. */ + reboot_event_handled = 1; + + if (code == SYS_DOWN || code == SYS_HALT) { + /* Disable the WDT if we are shutting down. */ + ipmi_watchdog_state = WDOG_TIMEOUT_NONE; + ipmi_set_timeout(); + } else { + /* Set a long timer to let the reboot happens, but + reboot if it hangs. */ + timeout = 120; + pretimeout = 0; + ipmi_watchdog_state = WDOG_TIMEOUT_RESET; + ipmi_set_timeout(); + } + } + return NOTIFY_OK; +} + +static struct notifier_block wdog_reboot_notifier = { + wdog_reboot_handler, + NULL, + 0 +}; + +extern int panic_timeout; /* Why isn't this defined anywhere? 
*/
+
+static int wdog_panic_handler(struct notifier_block *this,
+			      unsigned long         event,
+			      void                  *unused)
+{
+	static int panic_event_handled = 0;
+
+	/* On a panic, if we have a panic timeout, make sure that the thing
+	   reboots, even if it hangs during that panic. */
+	if (watchdog_user && !panic_event_handled && (panic_timeout > 0)) {
+		/* Make sure the panic doesn't hang, and make sure we
+		   do this only once. */
+		panic_event_handled = 1;
+
+		timeout = panic_timeout + 120;
+		if (timeout > 255)
+			timeout = 255;
+		pretimeout = 0;
+		ipmi_watchdog_state = WDOG_TIMEOUT_RESET;
+		ipmi_set_timeout();
+	}
+
+	return NOTIFY_OK;
+}
+
+static struct notifier_block wdog_panic_notifier = {
+	wdog_panic_handler,
+	NULL,
+	150   /* priority: INT_MAX >= x >= 0 */
+};
+
+
+static void ipmi_new_smi(int if_num)
+{
+	ipmi_register_watchdog(if_num);
+}
+
+static void ipmi_smi_gone(int if_num)
+{
+	/* This can never be called, because once the watchdog is
+	   registered, the interface can't go away until the watchdog
+	   is unregistered. */
+}
+
+static struct ipmi_smi_watcher smi_watcher =
+{
+	.new_smi  = ipmi_new_smi,
+	.smi_gone = ipmi_smi_gone
+};
+
+static int __init ipmi_wdog_init(void)
+{
+	int rv;
+
+	if (strcmp(action, "reset") == 0) {
+		action_val = WDOG_TIMEOUT_RESET;
+	} else if (strcmp(action, "power_cycle") == 0) {
+		action_val = WDOG_TIMEOUT_POWER_CYCLE;
+	} else if (strcmp(action, "power_off") == 0) {
+		action_val = WDOG_TIMEOUT_POWER_DOWN;
+	} else {
+		action_val = WDOG_TIMEOUT_RESET;
+		printk("ipmi_watchdog: Unknown action '%s', defaulting to"
+		       " reset\n", action);
+	}
+
+	if (strcmp(preaction, "none") == 0) {
+		preaction_val = WDOG_PRETIMEOUT_NONE;
+	} else if (strcmp(preaction, "pre_smi") == 0) {
+		preaction_val = WDOG_PRETIMEOUT_SMI;
+#ifdef HAVE_NMI_HANDLER
+	} else if (strcmp(preaction, "pre_nmi") == 0) {
+		preaction_val = WDOG_PRETIMEOUT_NMI;
+#endif
+	} else if (strcmp(preaction, "pre_int") == 0) {
+		preaction_val = WDOG_PRETIMEOUT_MSG_INT;
+	} else {
+		preaction_val = WDOG_PRETIMEOUT_NONE;
+		printk("ipmi_watchdog: Unknown preaction '%s', defaulting to"
+		       " none\n", preaction);
+	}
+
+#ifdef HAVE_NMI_HANDLER
+	if (preaction_val == WDOG_PRETIMEOUT_NMI) {
+		rv = request_nmi(&ipmi_nmi_handler);
+		if (rv) {
+			printk(KERN_WARNING
+			       "ipmi_watchdog: Can't register nmi handler\n");
+			return rv;
+		}
+	}
+#endif
+
+	rv = ipmi_smi_watcher_register(&smi_watcher);
+	if (rv) {
+#ifdef HAVE_NMI_HANDLER
+		if (preaction_val == WDOG_PRETIMEOUT_NMI)
+			release_nmi(&ipmi_nmi_handler);
+#endif
+		printk(KERN_WARNING
+		       "ipmi_watchdog: can't register smi watcher\n");
+		return rv;
+	}
+
+	register_reboot_notifier(&wdog_reboot_notifier);
+	notifier_chain_register(&panic_notifier_list, &wdog_panic_notifier);
+
+	printk(KERN_INFO "IPMI watchdog by "
+	       "Corey Minyard (minyard@mvista.com)\n");
+
+	return 0;
+}
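+
+/*
+ * A userland sketch, not part of the patch: driving this watchdog
+ * through the standard /dev/watchdog node.  The timeout values are
+ * arbitrary, and WDIOC_SET_PRETIMEOUT is the driver-local ioctl
+ * defined earlier in this file, so it may not be in your watchdog.h.
+ */
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/ioctl.h>
+#include <linux/watchdog.h>
+
+int main(void)
+{
+	int t = 30, p = 10;
+	int fd = open("/dev/watchdog", O_WRONLY);
+
+	if (fd < 0)
+		return 1;
+	ioctl(fd, WDIOC_SETTIMEOUT, &t);	/* arm for 30 seconds */
+	ioctl(fd, WDIOC_SET_PRETIMEOUT, &p);	/* warn 10s early */
+	for (;;) {
+		write(fd, "\0", 1);		/* heartbeat */
+		sleep(10);
+	}
+}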
+
+#ifdef MODULE
+static void ipmi_unregister_watchdog(void)
+{
+	int           rv;
+	unsigned long flags;
+
+	spin_lock_irqsave(&register_lock, flags);
+
+	if (! watchdog_user)
+		goto out;
+
+#ifdef HAVE_NMI_HANDLER
+	if (preaction_val == WDOG_PRETIMEOUT_NMI)
+		release_nmi(&ipmi_nmi_handler);
+#endif
+
+	/* Make sure no one can call us any more. */
+	misc_deregister(&ipmi_wdog_miscdev);
+
+	notifier_chain_unregister(&panic_notifier_list, &wdog_panic_notifier);
+	unregister_reboot_notifier(&wdog_reboot_notifier);
+
+	/* Disable the timer. */
+	ipmi_watchdog_state = WDOG_TIMEOUT_NONE;
+	ipmi_set_timeout();
+
+	/* Wait to make sure the message makes it out.  The lower layer has
+	   pointers to our buffers, we want to make sure they are done before
+	   we release our memory. */
+	while (atomic_read(&set_timeout_tofree)) {
+		schedule_timeout(1);
+	}
+
+	/* Disconnect from IPMI. */
+	rv = ipmi_destroy_user(watchdog_user);
+	if (rv) {
+		printk(KERN_WARNING
+		       "IPMI Watchdog, error unlinking from IPMI: %d\n",
+		       rv);
+	}
+	watchdog_user = NULL;
+
+ out:
+	spin_unlock_irqrestore(&register_lock, flags);
+}
+
+static void __exit ipmi_wdog_exit(void)
+{
+	ipmi_smi_watcher_unregister(&smi_watcher);
+	ipmi_unregister_watchdog();
+}
+module_exit(ipmi_wdog_exit);
+#else
+static int __init ipmi_wdog_setup(char *str)
+{
+	int  val;
+	int  rv;
+	char *option;
+
+	rv = get_option(&str, &val);
+	if (rv == 0)
+		return 1;
+	if (val > 0)
+		timeout = val;
+	if (rv == 1)
+		return 1;
+
+	rv = get_option(&str, &val);
+	if (rv == 0)
+		return 1;
+	if (val >= 0)
+		pretimeout = val;
+	if (rv == 1)
+		return 1;
+
+	/* Compare each parsed option, not the remainder string. */
+	while ((option = strsep(&str, ",")) != NULL) {
+		if (strcmp(option, "reset") == 0) {
+			action = "reset";
+		}
+		else if (strcmp(option, "power_cycle") == 0) {
+			action = "power_cycle";
+		}
+		else if (strcmp(option, "power_off") == 0) {
+			action = "power_off";
+		}
+		else if (strcmp(option, "pre_smi") == 0) {
+			preaction = "pre_smi";
+		}
+#ifdef HAVE_NMI_HANDLER
+		else if (strcmp(option, "pre_nmi") == 0) {
+			preaction = "pre_nmi";
+		}
+#endif
+		else if (strcmp(option, "pre_int") == 0) {
+			preaction = "pre_int";
+		}
+		else if (strcmp(option, "start_now") == 0) {
+			start_now = 1;
+		} else {
+			printk("Unknown IPMI watchdog option: '%s'\n",
+			       option);
+		}
+	}
+
+	return 1;
+}
+__setup("ipmi_wdog=", ipmi_wdog_setup);
+#endif
+
+EXPORT_SYMBOL(ipmi_delayed_shutdown);
+
+module_init(ipmi_wdog_init);
+MODULE_LICENSE("GPL");
diff -urN linux.orig/include/asm-i386/apic.h linux/include/asm-i386/apic.h
--- linux.orig/include/asm-i386/apic.h	Mon Oct 21 13:26:04 2002
+++ linux/include/asm-i386/apic.h	Tue Oct 22 12:40:16 2002
@@ -79,7 +79,6 @@
 extern void setup_boot_APIC_clock (void);
 extern void setup_secondary_APIC_clock (void);
 extern void setup_apic_nmi_watchdog (void);
-extern inline void nmi_watchdog_tick (struct pt_regs * regs);
 extern int APIC_init_uniprocessor (void);
 extern void disable_APIC_timer(void);
 extern void enable_APIC_timer(void);
diff -urN linux.orig/include/asm-i386/nmi.h linux/include/asm-i386/nmi.h
--- linux.orig/include/asm-i386/nmi.h	Mon Oct 21 13:25:52 2002
+++ linux/include/asm-i386/nmi.h	Thu Oct 24 20:50:22 2002
@@ -5,26 +5,11 @@
 #define ASM_NMI_H
 
 #include <linux/pm.h>
+#include <linux/rcupdate.h>
+#include <linux/sched.h>
 
 struct pt_regs;
 
-typedef int (*nmi_callback_t)(struct pt_regs * regs, int cpu);
-
-/**
- * set_nmi_callback
- *
- * Set a handler for an NMI. Only one handler may be
- * set. Return 1 if the NMI was handled.
- */
-void set_nmi_callback(nmi_callback_t callback);
-
-/**
- * unset_nmi_callback
- *
- * Remove the handler previously set.
- */
-void unset_nmi_callback(void);
-
 #ifdef CONFIG_PM
@@ -45,5 +30,34 @@
 }
 #endif /* CONFIG_PM */
 
+
+
+/**
+ * Register a handler to get called when an NMI occurs.  If the
+ * handler actually handles the NMI, it should return NOTIFY_OK.  If
+ * it did not handle the NMI, it should return NOTIFY_DONE.  It may "or"
+ * on NOTIFY_STOP_MASK to the return value if it does not want other
+ * handlers after it to be notified.
+ */
+#define HAVE_NMI_HANDLER	1
+struct nmi_handler
+{
+	struct list_head link; /* You must init this before use.
*/ + + char *dev_name; + void *dev_id; + int (*handler)(void *dev_id, struct pt_regs *regs, int cpu, int handled); + int priority; /* Handlers called in priority order. */ + + /* Don't mess with anything below here. */ + + struct rcu_head rcu; + struct completion complete; +}; + +int request_nmi(struct nmi_handler *handler); + +/* Release will block until the handler is completely free. */ +void release_nmi(struct nmi_handler *handler); #endif /* ASM_NMI_H */ diff -urN linux.orig/include/linux/ipmi.h linux/include/linux/ipmi.h --- linux.orig/include/linux/ipmi.h Wed Dec 31 18:00:00 1969 +++ linux/include/linux/ipmi.h Wed Oct 30 13:51:55 2002 @@ -0,0 +1,516 @@ +/* + * ipmi.h + * + * MontaVista IPMI interface + * + * Author: MontaVista Software, Inc. + * Corey Minyard <minyard@mvista.com> + * source@mvista.com + * + * Copyright 2002 MontaVista Software Inc. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or (at your + * option) any later version. + * + * + * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED + * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, + * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS + * OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND + * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR + * TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE + * USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + * + * You should have received a copy of the GNU General Public License along + * with this program; if not, write to the Free Software Foundation, Inc., + * 675 Mass Ave, Cambridge, MA 02139, USA. + */ + +#ifndef __LINUX_IPMI_H +#define __LINUX_IPMI_H + +#include <linux/ipmi_msgdefs.h> + +/* + * This file describes an interface to an IPMI driver. You have to + * have a fairly good understanding of IPMI to use this, so go read + * the specs first before actually trying to do anything. + * + * With that said, this driver provides a multi-user interface to the + * IPMI driver, and it allows multiple IPMI physical interfaces below + * the driver. The physical interfaces bind as a lower layer on the + * driver. They appear as interfaces to the application using this + * interface. + * + * Multi-user means that multiple applications may use the driver, + * send commands, receive responses, etc. The driver keeps track of + * commands the user sends and tracks the responses. The responses + * will go back to the application that send the command. If the + * response doesn't come back in time, the driver will return a + * timeout error response to the application. Asynchronous events + * from the BMC event queue will go to all users bound to the driver. + * The incoming event queue in the BMC will automatically be flushed + * if it becomes full and it is queried once a second to see if + * anything is in it. Incoming commands to the driver will get + * delivered as commands. + * + * This driver provides two main interfaces: one for in-kernel + * applications and another for userland applications. 
+
+
+
+/*
+ * This is an overlay for all the address types, so it's easy to
+ * determine the actual address type.  This is kind of like how
+ * addresses work for sockets.
+ */
+#define IPMI_MAX_ADDR_SIZE 32
+struct ipmi_addr
+{
+	 /* Try to take these from the "Channel Medium Type" table
+	    in section 6.5 of the IPMI 1.5 manual. */
+	int   addr_type;
+	short channel;
+	char  data[IPMI_MAX_ADDR_SIZE];
+};
+
+/*
+ * Use this address type when talking straight to the BMC.  The
+ * channel is the BMC's channel number for the channel (usually 0),
+ * or IPMI_BMC_CHANNEL if communicating directly with the BMC.
+ */
+#define IPMI_SYSTEM_INTERFACE_ADDR_TYPE	0x0c
+struct ipmi_system_interface_addr
+{
+	int           addr_type;
+	short         channel;
+	unsigned char lun;
+};
+
+/* An IPMB Address. */
+#define IPMI_IPMB_ADDR_TYPE 0x01
+/* Used for broadcast get device id as described in section 17.9 of the
+   IPMI 1.5 manual. */
+#define IPMI_IPMB_BROADCAST_ADDR_TYPE 0x41
+struct ipmi_ipmb_addr
+{
+	int           addr_type;
+	short         channel;
+	unsigned char slave_addr;
+	unsigned char lun;
+};
+
+
+/*
+ * Channel for talking directly with the BMC.  This channel is for
+ * the system interface address type only.  FIXME - is this right, or
+ * should we use -1?
+ */
+#define IPMI_BMC_CHANNEL  0xf
+#define IPMI_NUM_CHANNELS 0x10
+
+
+/*
+ * A raw IPMI message without any addressing.  This covers both
+ * commands and responses.  The completion code is always the first
+ * byte of data in the response (as the spec shows the messages laid
+ * out).
+ */
+struct ipmi_msg
+{
+	unsigned char  netfn;
+	unsigned char  cmd;
+	unsigned short data_len;
+	unsigned char  *data;
+};
+
+/*
+ * Various defines that are useful for IPMI applications.
+ */
+#define IPMI_INVALID_CMD_COMPLETION_CODE	0xC1
+#define IPMI_TIMEOUT_COMPLETION_CODE		0xC3
+#define IPMI_UNKNOWN_ERR_COMPLETION_CODE	0xff
+
+
+/*
+ * Receive types for messages coming from the receive interface.  This
+ * is used for the receive in-kernel interface and in the receive
+ * IOCTL.
+ */
+#define IPMI_RESPONSE_RECV_TYPE		1 /* A response to a command */
+#define IPMI_ASYNC_EVENT_RECV_TYPE	2 /* Something from the event queue */
+#define IPMI_CMD_RECV_TYPE		3 /* A command from somewhere else */
+/* Note that async events and received commands do not have a completion
+   code as the first byte of the incoming data, unlike a response. */
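+/*
+ * For illustration (not normative; the slave address is an example
+ * value only), the two address types might be filled in like this:
+ *
+ *	struct ipmi_system_interface_addr si;
+ *	struct ipmi_ipmb_addr             ipmb;
+ *
+ *	si.addr_type = IPMI_SYSTEM_INTERFACE_ADDR_TYPE;
+ *	si.channel   = IPMI_BMC_CHANNEL; // straight to the BMC
+ *	si.lun       = 0;
+ *
+ *	ipmb.addr_type  = IPMI_IPMB_ADDR_TYPE;
+ *	ipmb.channel    = 0;    // first IPMB channel
+ *	ipmb.slave_addr = 0x52; // example slave address
+ *	ipmb.lun        = 0;
+ */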
+
+
+
+#ifdef __KERNEL__
+
+/*
+ * The in-kernel interface.
+ */
+#include <linux/list.h>
+
+/* Opaque type for an IPMI message user.  One of these is needed to
+   send and receive messages. */
+typedef struct ipmi_user *ipmi_user_t;
+
+/*
+ * Stuff coming from the receive interface comes as one of these.
+ * They are allocated; the receiver must free them with
+ * ipmi_free_recv_msg() when done with the message.  The link is not
+ * used after the message is delivered, so the upper layer may use the
+ * link to build a linked list, if it likes.
+ */
+struct ipmi_recv_msg
+{
+	struct list_head link;
+
+	/* The type of message as defined in the "Receive Types"
+	   defines above. */
+	int              recv_type;
+
+	ipmi_user_t      user;
+	struct ipmi_addr addr;
+	long             msgid;
+	struct ipmi_msg  msg;
+
+	/* Call this when done with the message.  It will presumably free
+	   the message and do any other necessary cleanup. */
+	void (*done)(struct ipmi_recv_msg *msg);
+
+	/* Place-holder for the data, don't make any assumptions about
+	   the size or existence of this, since it may change. */
+	unsigned char    msg_data[IPMI_MAX_MSG_LENGTH];
+};
+
+/* Allocate and free the receive message. */
+static inline void ipmi_free_recv_msg(struct ipmi_recv_msg *msg)
+{
+	msg->done(msg);
+}
+struct ipmi_recv_msg *ipmi_alloc_recv_msg(void);
+
+struct ipmi_user_hndl
+{
+	/* Routine type to call when a message needs to be routed to
+	   the upper layer.  This will be called with some locks held;
+	   the only IPMI routines that may be called from it are
+	   ipmi_request and the alloc/free operations. */
+	void (*ipmi_recv_hndl)(struct ipmi_recv_msg *msg,
+			       void                 *handler_data);
+
+	/* Called when the interface detects a watchdog pre-timeout.  If
+	   this is NULL, it will be ignored for the user. */
+	void (*ipmi_watchdog_pretimeout)(void *handler_data);
+};
+
+/* Create a new user of the IPMI layer on the given interface number. */
+int ipmi_create_user(unsigned int          if_num,
+		     struct ipmi_user_hndl *handler,
+		     void                  *handler_data,
+		     ipmi_user_t           *user);
+
+/* Destroy the given user of the IPMI layer. */
+int ipmi_destroy_user(ipmi_user_t user);
+
+/* Get the IPMI version of the BMC we are talking to. */
+void ipmi_get_version(ipmi_user_t   user,
+		      unsigned char *major,
+		      unsigned char *minor);
+
+/* Set and get the slave address and LUN that we will use for our
+   source messages.  Note that this affects the interface, not just
+   this user, so it will affect all users of this interface.  This is
+   so some initialization code can come in and do the OEM-specific
+   things it takes to determine your address (if you are not the BMC)
+   and set it for everyone else. */
+void ipmi_set_my_address(ipmi_user_t   user,
+			 unsigned char address);
+unsigned char ipmi_get_my_address(ipmi_user_t user);
+void ipmi_set_my_LUN(ipmi_user_t   user,
+		     unsigned char LUN);
+unsigned char ipmi_get_my_LUN(ipmi_user_t user);
+
+/*
+ * Send a command request from the given user.  The address is the
+ * proper address for the channel type.  If this is a command, then
+ * when the response comes back, the receive handler for this user
+ * will be called with the given msgid value in the recv msg.  If this
+ * is a response to a command, then the msgid will be used as the
+ * sequence number for the response (truncated if necessary), so when
+ * sending a response you should use the sequence number you received
+ * in the msgid field of the received command.  If the priority is
+ * > 0, the message will go into a high-priority queue and be sent
+ * first.  Otherwise, it goes into a normal-priority queue.
+ */
+int ipmi_request(ipmi_user_t      user,
+		 struct ipmi_addr *addr,
+		 long             msgid,
+		 struct ipmi_msg  *msg,
+		 int              priority);
+
+/*
+ * Like ipmi_request, but lets you specify the slave return address.
+ */
+int ipmi_request_with_source(ipmi_user_t      user,
+			     struct ipmi_addr *addr,
+			     long             msgid,
+			     struct ipmi_msg  *msg,
+			     int              priority,
+			     unsigned char    source_address,
+			     unsigned char    source_lun);
+
+/*
+ * Like ipmi_request, but with messages supplied.  This will not
+ * allocate any memory, and the messages may be statically allocated
+ * (just make sure to do the "done" handling on them).  Note that this
+ * is primarily for the watchdog timer, since it should be able to
+ * send messages even if no memory is available.  This is subject to
+ * change as the system changes, so don't use it unless you REALLY
+ * have to.
+ */
+int ipmi_request_supply_msgs(ipmi_user_t          user,
+			     struct ipmi_addr     *addr,
+			     long                 msgid,
+			     struct ipmi_msg      *msg,
+			     void                 *supplied_smi,
+			     struct ipmi_recv_msg *supplied_recv,
+			     int                  priority);
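+/*
+ * For illustration (a sketch only; my_recv and my_handler are
+ * made-up names, and error handling is omitted), sending a Get
+ * Device ID command to the local BMC might look like:
+ *
+ *	static void my_recv(struct ipmi_recv_msg *msg, void *handler_data)
+ *	{
+ *		// May run at interrupt level; check msg->recv_type.
+ *		// The completion code is msg->msg.data[0].
+ *		ipmi_free_recv_msg(msg);
+ *	}
+ *	static struct ipmi_user_hndl my_handler = {
+ *		.ipmi_recv_hndl = my_recv,
+ *	};
+ *
+ *	struct ipmi_system_interface_addr addr;
+ *	struct ipmi_msg                   msg;
+ *	ipmi_user_t                       user;
+ *
+ *	rv = ipmi_create_user(0, &my_handler, NULL, &user);
+ *	addr.addr_type = IPMI_SYSTEM_INTERFACE_ADDR_TYPE;
+ *	addr.channel   = IPMI_BMC_CHANNEL;
+ *	addr.lun       = 0;
+ *	msg.netfn      = IPMI_NETFN_APP_REQUEST;
+ *	msg.cmd        = IPMI_GET_DEVICE_ID_CMD;
+ *	msg.data       = NULL;
+ *	msg.data_len   = 0;
+ *	rv = ipmi_request(user, (struct ipmi_addr *) &addr, 1234, &msg, 0);
+ *	// 1234 is an arbitrary msgid; it comes back in the response.
+ */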
+ */ +int ipmi_request_supply_msgs(ipmi_user_t user, + struct ipmi_addr *addr, + long msgid, + struct ipmi_msg *msg, + void *supplied_smi, + struct ipmi_recv_msg *supplied_recv, + int priority); + +/* + * When commands come in to the SMS, the user can register to receive + * them. Only one user can be listening on a specific netfn/cmd pair + * at a time, you will get an EBUSY error if the command is already + * registered. If a command is received that does not have a user + * registered, the driver will automatically return the proper + * error. + */ +int ipmi_register_for_cmd(ipmi_user_t user, + unsigned char netfn, + unsigned char cmd); +int ipmi_unregister_for_cmd(ipmi_user_t user, + unsigned char netfn, + unsigned char cmd); + +/* + * When the user is created, it will not receive IPMI events by + * default. The user must set this to TRUE to get incoming events. + * The first user that sets this to TRUE will receive all events that + * have been queued while no one was waiting for events. + */ +int ipmi_set_gets_events(ipmi_user_t user, int val); + +/* + * Register the given user to handle all received IPMI commands. This + * will fail if anyone is registered as a command receiver or if + * another is already registered to receive all commands. NOTE THAT + * THIS IS FOR EMULATION USERS ONLY, DO NOT USER THIS FOR NORMAL + * STUFF. + */ +int ipmi_register_all_cmd_rcvr(ipmi_user_t user); +int ipmi_unregister_all_cmd_rcvr(ipmi_user_t user); + + +/* + * Called when a new SMI is registered. This will also be called on + * every existing interface when a new watcher is registered with + * ipmi_smi_watcher_register(). + */ +struct ipmi_smi_watcher +{ + struct list_head link; + + /* These two are called with read locks held for the interface + the watcher list. So you can add and remove users from the + IPMI interface, send messages, etc., but you cannot add + or remove SMI watchers or SMI interfaces. */ + void (*new_smi)(int if_num); + void (*smi_gone)(int if_num); +}; + +int ipmi_smi_watcher_register(struct ipmi_smi_watcher *watcher); +int ipmi_smi_watcher_unregister(struct ipmi_smi_watcher *watcher); + +/* The following are various helper functions for dealing with IPMI + addresses. */ + +/* Return the maximum length of an IPMI address given it's type. */ +unsigned int ipmi_addr_length(int addr_type); + +/* Validate that the given IPMI address is valid. */ +int ipmi_validate_addr(struct ipmi_addr *addr, int len); + +/* Return 1 if the given addresses are equal, 0 if not. */ +int ipmi_addr_equal(struct ipmi_addr *addr1, struct ipmi_addr *addr2); + +#endif /* __KERNEL__ */ + + +/* + * The userland interface + */ + +/* + * The userland interface for the IPMI driver is a standard character + * device, with each instance of an interface registered as a minor + * number under the major character device. + * + * The read and write calls do not work, to get messages in and out + * requires ioctl calls because of the complexity of the data. select + * and poll do work, so you can wait for input using the file + * descriptor, you just can use read to get it. + * + * In general, you send a command down to the interface and receive + * responses back. You can use the msgid value to correlate commands + * and responses, the driver will take care of figuring out which + * incoming messages are for which command and find the proper msgid + * value to report. You will only receive reponses for commands you + * send. 
+
+/* The following are various helper functions for dealing with IPMI
+   addresses. */
+
+/* Return the maximum length of an IPMI address given its type. */
+unsigned int ipmi_addr_length(int addr_type);
+
+/* Validate that the given IPMI address is valid. */
+int ipmi_validate_addr(struct ipmi_addr *addr, int len);
+
+/* Return 1 if the given addresses are equal, 0 if not. */
+int ipmi_addr_equal(struct ipmi_addr *addr1, struct ipmi_addr *addr2);
+
+#endif /* __KERNEL__ */
+
+
+/*
+ * The userland interface
+ */
+
+/*
+ * The userland interface for the IPMI driver is a standard character
+ * device, with each instance of an interface registered as a minor
+ * number under the major character device.
+ *
+ * The read and write calls do not work; getting messages in and out
+ * requires ioctl calls because of the complexity of the data.  select
+ * and poll do work, so you can wait for input using the file
+ * descriptor, you just can't use read to get it.
+ *
+ * In general, you send a command down to the interface and receive
+ * responses back.  You can use the msgid value to correlate commands
+ * and responses; the driver will take care of figuring out which
+ * incoming messages are for which command and find the proper msgid
+ * value to report.  You will only receive responses for commands you
+ * send.  Asynchronous events, however, go to all open users, so you
+ * must be ready to handle these (or ignore them if you don't care).
+ *
+ * The address type depends upon the channel type.  When talking
+ * directly to the BMC (IPMI_BMC_CHANNEL), the address is ignored
+ * (IPMI_UNUSED_ADDR_TYPE).  When talking to an IPMB channel, you must
+ * supply a valid IPMB address with the addr_type set properly.
+ *
+ * When talking to normal channels, the driver takes care of the
+ * details of formatting and sending messages on that channel.  You do
+ * not, for instance, have to format a send command; you just send
+ * whatever command you want to the channel, and the driver will
+ * create the send command, automatically issue receive and get event
+ * commands, and pass those up to the proper user.
+ */
+
+
+/* The magic IOCTL value for this interface. */
+#define IPMI_IOC_MAGIC 'i'
+
+
+/* Messages sent to the interface are this format. */
+struct ipmi_req
+{
+	unsigned char *addr; /* Address to send the message to. */
+	unsigned int  addr_len;
+
+	long    msgid; /* The sequence number for the message.  This
+			  exact value will be reported back in the
+			  response to this request if it is a command.
+			  If it is a response, this will be used as
+			  the sequence value for the response. */
+
+	struct ipmi_msg msg;
+};
+/*
+ * Send a message to the interface.  error values are:
+ *   - EFAULT - an address supplied was invalid.
+ *   - EINVAL - The address supplied was not valid, or the command
+ *              was not allowed.
+ *   - EMSGSIZE - The message was too large.
+ *   - ENOMEM - Buffers could not be allocated for the command.
+ */
+#define IPMICTL_SEND_COMMAND		_IOR(IPMI_IOC_MAGIC, 13,	\
+					     struct ipmi_req)
+
+/* Messages received from the interface are this format. */
+struct ipmi_recv
+{
+	int     recv_type; /* Is this a command, response or an
+			      asynchronous event. */
+
+	unsigned char *addr;    /* Address the message was from is put
+				   here.  The caller must supply the
+				   memory. */
+	unsigned int  addr_len; /* The size of the address buffer.
+				   The caller supplies the full buffer
+				   length; this value is updated to
+				   the actual address length when the
+				   message is received. */
+
+	long    msgid; /* The sequence number specified in the request
+			  if this is a response.  If this is a command,
+			  this will be the sequence number from the
+			  command. */
+
+	struct ipmi_msg msg; /* The data field must point to a buffer.
+				The data_len field must be set to the
+				size of the message buffer.  The
+				caller supplies the full buffer
+				length; this value is updated to the
+				actual message length when the message
+				is received. */
+};
+
+/*
+ * Receive a message.  error values:
+ *  - EAGAIN - no messages in the queue.
+ *  - EFAULT - an address supplied was invalid.
+ *  - EINVAL - The address supplied was not valid.
+ *  - EMSGSIZE - The message was too large to fit into the message buffer;
+ *               the message will be left in the buffer. */
+#define IPMICTL_RECEIVE_MSG		_IOWR(IPMI_IOC_MAGIC, 12,	\
+					      struct ipmi_recv)
+
+/*
+ * Like RECEIVE_MSG, but if the message won't fit in the buffer, it
+ * will truncate the contents instead of leaving the data in the
+ * buffer.
+ */
+#define IPMICTL_RECEIVE_MSG_TRUNC	_IOWR(IPMI_IOC_MAGIC, 11,	\
+					      struct ipmi_recv)
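+/*
+ * For illustration (a sketch; error handling is omitted and the
+ * device name may vary), a userland Get Device ID exchange might
+ * look like:
+ *
+ *	struct ipmi_system_interface_addr si;
+ *	struct ipmi_req  req;
+ *	struct ipmi_recv recv;
+ *	struct ipmi_addr raddr;
+ *	unsigned char    data[IPMI_MAX_MSG_LENGTH];
+ *	struct pollfd    pfd;
+ *	int fd = open("/dev/ipmi0", O_RDWR);
+ *
+ *	si.addr_type = IPMI_SYSTEM_INTERFACE_ADDR_TYPE;
+ *	si.channel   = IPMI_BMC_CHANNEL;
+ *	si.lun       = 0;
+ *	req.addr         = (unsigned char *) &si;
+ *	req.addr_len     = sizeof(si);
+ *	req.msgid        = 1;	// arbitrary; echoed in the response
+ *	req.msg.netfn    = IPMI_NETFN_APP_REQUEST;
+ *	req.msg.cmd      = IPMI_GET_DEVICE_ID_CMD;
+ *	req.msg.data     = NULL;
+ *	req.msg.data_len = 0;
+ *	ioctl(fd, IPMICTL_SEND_COMMAND, &req);
+ *
+ *	pfd.fd = fd;
+ *	pfd.events = POLLIN;
+ *	poll(&pfd, 1, -1);	// wait for the response
+ *
+ *	// You must supply both the address and data buffers.
+ *	recv.addr         = (unsigned char *) &raddr;
+ *	recv.addr_len     = sizeof(raddr);
+ *	recv.msg.data     = data;
+ *	recv.msg.data_len = sizeof(data);
+ *	ioctl(fd, IPMICTL_RECEIVE_MSG_TRUNC, &recv);
+ *	// data[0] is the completion code of the response.
+ */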
+
+/* Register to get commands from other entities on this interface. */
+struct ipmi_cmdspec
+{
+	unsigned char netfn;
+	unsigned char cmd;
+};
+
+/*
+ * Register to receive a specific command.  error values:
+ *   - EFAULT - an address supplied was invalid.
+ *   - EBUSY - The netfn/cmd supplied was already in use.
+ *   - ENOMEM - could not allocate memory for the entry.
+ */
+#define IPMICTL_REGISTER_FOR_CMD	_IOR(IPMI_IOC_MAGIC, 14,	\
+					     struct ipmi_cmdspec)
+/*
+ * Unregister a registered command.  error values:
+ *  - EFAULT - an address supplied was invalid.
+ *  - ENOENT - The netfn/cmd was not found registered for this user.
+ */
+#define IPMICTL_UNREGISTER_FOR_CMD	_IOR(IPMI_IOC_MAGIC, 15,	\
+					     struct ipmi_cmdspec)
+
+/*
+ * Set whether this interface receives events.  Note that the first
+ * user registered for events will get all pending events for the
+ * interface.  error values:
+ *   - EFAULT - an address supplied was invalid.
+ */
+#define IPMICTL_SET_GETS_EVENTS_CMD	_IOR(IPMI_IOC_MAGIC, 16, int)
+
+/*
+ * Set and get the slave address and LUN that we will use for our
+ * source messages.  Note that this affects the interface, not just
+ * this user, so it will affect all users of this interface.  This is
+ * so some initialization code can come in and do the OEM-specific
+ * things it takes to determine your address (if you are not the BMC)
+ * and set it for everyone else.  You should probably leave the LUN
+ * alone.
+ */
+#define IPMICTL_SET_MY_ADDRESS_CMD	_IOR(IPMI_IOC_MAGIC, 17, unsigned int)
+#define IPMICTL_GET_MY_ADDRESS_CMD	_IOR(IPMI_IOC_MAGIC, 18, unsigned int)
+#define IPMICTL_SET_MY_LUN_CMD		_IOR(IPMI_IOC_MAGIC, 19, unsigned int)
+#define IPMICTL_GET_MY_LUN_CMD		_IOR(IPMI_IOC_MAGIC, 20, unsigned int)
+
+#endif /* __LINUX_IPMI_H */
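To tie the command-registration and event ioctls together, a minimal
userland setup might look like this (illustration only; the netfn/cmd
values are invented examples, and error handling is omitted):

	#include <sys/ioctl.h>
	#include <linux/ipmi.h>

	void setup_async(int fd)
	{
		struct ipmi_cmdspec spec;
		int                 val = 1;

		/* Start receiving asynchronous events on this file. */
		ioctl(fd, IPMICTL_SET_GETS_EVENTS_CMD, &val);

		/* Claim an example OEM netfn/cmd pair; EBUSY means
		   another user already registered for it. */
		spec.netfn = 0x30;
		spec.cmd   = 0x01;
		ioctl(fd, IPMICTL_REGISTER_FOR_CMD, &spec);
	}

Incoming commands and events then arrive through IPMICTL_RECEIVE_MSG
with recv_type set to IPMI_CMD_RECV_TYPE or IPMI_ASYNC_EVENT_RECV_TYPE.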
diff -urN linux.orig/include/linux/ipmi_msgdefs.h linux/include/linux/ipmi_msgdefs.h
--- linux.orig/include/linux/ipmi_msgdefs.h	Wed Dec 31 18:00:00 1969
+++ linux/include/linux/ipmi_msgdefs.h	Thu Aug 22 08:26:11 2002
@@ -0,0 +1,58 @@
+/*
+ * ipmi_msgdefs.h
+ *
+ * MontaVista IPMI message definitions
+ *
+ * Author: MontaVista Software, Inc.
+ *         Corey Minyard <minyard@mvista.com>
+ *         source@mvista.com
+ *
+ * Copyright 2002 MontaVista Software Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2 of the License, or (at your
+ * option) any later version.
+ *
+ *
+ * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED
+ * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
+ * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
+ * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+ * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
+ * OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+ * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR
+ * TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE
+ * USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, write to the Free Software Foundation, Inc.,
+ * 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+
+#ifndef __LINUX_IPMI_MSGDEFS_H
+#define __LINUX_IPMI_MSGDEFS_H
+
+/* Various definitions for IPMI messages used by almost everything in
+   the IPMI stack. */
+
+#define IPMI_NETFN_APP_REQUEST		0x06
+#define IPMI_NETFN_APP_RESPONSE		0x07
+
+#define IPMI_BMC_SLAVE_ADDR	0x20
+
+#define IPMI_GET_DEVICE_ID_CMD	0x01
+
+#define IPMI_CLEAR_MSG_FLAGS_CMD	0x30
+#define IPMI_GET_MSG_FLAGS_CMD		0x31
+#define IPMI_SEND_MSG_CMD		0x34
+#define IPMI_GET_MSG_CMD		0x33
+
+#define IPMI_SET_BMC_GLOBAL_ENABLES_CMD	0x2e
+#define IPMI_GET_BMC_GLOBAL_ENABLES_CMD	0x2f
+#define IPMI_READ_EVENT_MSG_BUFFER_CMD	0x35
+
+#define IPMI_MAX_MSG_LENGTH	80
+
+#endif /* __LINUX_IPMI_MSGDEFS_H */
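One convention worth noting (from the IPMI spec, not something this
patch adds): a response netfn is the request netfn plus one, so
IPMI_NETFN_APP_RESPONSE is IPMI_NETFN_APP_REQUEST + 1 and odd netfns
are always responses.  A sketch of a helper built on that (the name
is invented for illustration):

	#include <linux/ipmi_msgdefs.h>

	/* Odd netfns are responses, e.g. IPMI_NETFN_APP_RESPONSE (0x07)
	   answers IPMI_NETFN_APP_REQUEST (0x06). */
	static inline int ipmi_netfn_is_response(unsigned char netfn)
	{
		return netfn & 1;
	}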
diff -urN linux.orig/include/linux/ipmi_smi.h linux/include/linux/ipmi_smi.h
--- linux.orig/include/linux/ipmi_smi.h	Wed Dec 31 18:00:00 1969
+++ linux/include/linux/ipmi_smi.h	Sun Oct 13 16:25:50 2002
@@ -0,0 +1,143 @@
+/*
+ * ipmi_smi.h
+ *
+ * MontaVista IPMI system management interface
+ *
+ * Author: MontaVista Software, Inc.
+ *         Corey Minyard <minyard@mvista.com>
+ *         source@mvista.com
+ *
+ * Copyright 2002 MontaVista Software Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2 of the License, or (at your
+ * option) any later version.
+ *
+ *
+ * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED
+ * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
+ * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
+ * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+ * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
+ * OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+ * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR
+ * TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE
+ * USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, write to the Free Software Foundation, Inc.,
+ * 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+
+#ifndef __LINUX_IPMI_SMI_H
+#define __LINUX_IPMI_SMI_H
+
+#include <linux/ipmi_msgdefs.h>
+
+/* This file describes the interface for IPMI system management interface
+   drivers to bind into the IPMI message handler. */
+
+/* Structure for the low-level drivers. */
+typedef struct ipmi_smi *ipmi_smi_t;
+
+/*
+ * Messages to/from the lower layer.  The smi interface will take one
+ * of these to send.  After the send has occurred and a response has
+ * been received, it will report this same data structure back up to
+ * the upper layer.  If an error occurs, it should fill in the
+ * response with an error code in the completion code location.  When
+ * asynchronous data is received, one of these is allocated, the
+ * data_size is set to zero, and the response holds the data from the
+ * get message or get event command that the interface initiated.
+ * Note that it is the interface's responsibility to detect that
+ * asynchronous data or messages are pending and to request them from
+ * the BMC.
+ */
+struct ipmi_smi_msg
+{
+	struct list_head link;
+
+	long    msgid;
+	void    *user_data;
+
+	/* If 0, add to the end of the queue.  If 1, add to the beginning. */
+	int     prio;
+
+	int           data_size;
+	unsigned char data[IPMI_MAX_MSG_LENGTH];
+
+	int           rsp_size;
+	unsigned char rsp[IPMI_MAX_MSG_LENGTH];
+
+	/* Will be called when the system is done with the message
+	   (presumably to free it). */
+	void (*done)(struct ipmi_smi_msg *msg);
+};
+
+struct ipmi_smi_handlers
+{
+	/* Called to enqueue an SMI message to be sent.  This
+	   operation is not allowed to fail.  If an error occurs, it
+	   should report back the error in a received message.  It may
+	   do this in the current call context, since no write locks
+	   are held when this is run.  If the priority is > 0, the
+	   message will go into a high-priority queue and be sent
+	   first.  Otherwise, it goes into a normal-priority queue. */
+	void (*sender)(void *send_info,
+		       struct ipmi_smi_msg *msg,
+		       int priority);
+
+	/* Called by the upper layer to request that we try to get
+	   events from the BMC we are attached to. */
+	void (*request_events)(void *send_info);
+
+	/* Called when someone is using the interface, so the module can
+	   adjust its use count. */
+	void (*new_user)(void *send_info);
+
+	/* Called when someone is no longer using the interface, so the
+	   module can adjust its use count. */
+	void (*user_left)(void *send_info);
+
+	/* Called when the interface should go into "run to
+	   completion" mode.  If this call sets the value to true, the
+	   interface should make sure that all messages are flushed
+	   out and that none are pending, and any new requests are run
+	   to completion immediately. */
+	void (*set_run_to_completion)(void *send_info, int run_to_completion);
+};
+
+/* Add a low-level interface to the IPMI driver. */
+int ipmi_register_smi(struct ipmi_smi_handlers *handlers,
+		      void                     *send_info,
+		      unsigned char            version_major,
+		      unsigned char            version_minor,
+		      ipmi_smi_t               *intf);
+
+/*
+ * Remove a low-level interface from the IPMI driver.  This will
+ * return an error if the interface is still in use by a user.
+ */
+int ipmi_unregister_smi(ipmi_smi_t intf);
+
+/*
+ * The lower layer reports received messages through this interface.
+ * The data_size should be zero if this is an asynchronous message.  If
+ * the lower layer gets an error sending a message, it should format
+ * an error response in the message response.
+ */
+void ipmi_smi_msg_received(ipmi_smi_t          intf,
+			   struct ipmi_smi_msg *msg);
+
+/* The lower layer received a watchdog pre-timeout on the interface. */
+void ipmi_smi_watchdog_pretimeout(ipmi_smi_t intf);
+
+struct ipmi_smi_msg *ipmi_alloc_smi_msg(void);
+static inline void ipmi_free_smi_msg(struct ipmi_smi_msg *msg)
+{
+	msg->done(msg);
+}
+
+#endif /* __LINUX_IPMI_SMI_H */
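For a low-level driver author, the shape of the binding is roughly the
following (a sketch only; the my_* names are invented, and a real
driver, such as the KCS driver in this patch, also supplies
new_user(), user_left(), and set_run_to_completion()):

	#include <linux/ipmi_smi.h>

	static void my_sender(void *send_info, struct ipmi_smi_msg *msg,
			      int priority)
	{
		/* Queue msg->data/msg->data_size for transmission.  This
		   must not fail; on error, format an error completion
		   code into msg->rsp instead.  When the response is
		   ready, fill msg->rsp/msg->rsp_size and hand it back
		   with ipmi_smi_msg_received(). */
	}

	static void my_request_events(void *send_info)
	{
		/* Ask the BMC for pending events. */
	}

	static struct ipmi_smi_handlers my_handlers = {
		.sender         = my_sender,
		.request_events = my_request_events,
	};

	static ipmi_smi_t my_intf;

	/* At init, for an IPMI 1.5 BMC:
	       ipmi_register_smi(&my_handlers, my_send_info, 1, 5, &my_intf);
	   and at exit ipmi_unregister_smi(my_intf), which fails while
	   the interface is still in use. */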
diff -urN linux.orig/include/linux/nmi.h linux/include/linux/nmi.h
--- linux.orig/include/linux/nmi.h	Thu Jun 20 17:53:40 2002
+++ linux/include/linux/nmi.h	Thu Oct 24 16:28:53 2002
@@ -1,22 +1,11 @@
 /*
- * linux/include/linux/nmi.h
+ * linux/include/linux/nmi.h
+ *
+ * (C) 2002 Corey Minyard <cminyard@mvista.com>
+ *
+ * Include file for NMI handling.
  */
-#ifndef LINUX_NMI_H
-#define LINUX_NMI_H
-
-#include <asm/irq.h>
-
-/**
- * touch_nmi_watchdog - restart NMI watchdog timeout.
- *
- * If the architecture supports the NMI watchdog, touch_nmi_watchdog()
- * may be used to reset the timeout - for code which intentionally
- * disables interrupts for a long time. This call is stateless.
- */
-#ifdef ARCH_HAS_NMI_WATCHDOG
-extern void touch_nmi_watchdog(void);
-#else
-# define touch_nmi_watchdog() do { } while(0)
-#endif
+#if defined(__i386__)
+#include <asm/nmi.h>
 #endif
diff -urN linux.orig/include/linux/nmi_watchdog.h linux/include/linux/nmi_watchdog.h
--- linux.orig/include/linux/nmi_watchdog.h	Thu Oct 24 19:56:54 2002
+++ linux/include/linux/nmi_watchdog.h	Thu Oct 24 12:50:30 2002
@@ -0,0 +1,22 @@
+/*
+ * linux/include/linux/nmi_watchdog.h
+ */
+#ifndef LINUX_NMI_WATCHDOG_H
+#define LINUX_NMI_WATCHDOG_H
+
+#include <asm/irq.h>
+
+/**
+ * touch_nmi_watchdog - restart NMI watchdog timeout.
+ *
+ * If the architecture supports the NMI watchdog, touch_nmi_watchdog()
+ * may be used to reset the timeout - for code which intentionally
+ * disables interrupts for a long time. This call is stateless.
+ */
+#ifdef ARCH_HAS_NMI_WATCHDOG
+extern void touch_nmi_watchdog(void);
+#else
+# define touch_nmi_watchdog() do { } while(0)
+#endif
+
+#endif
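As a closing illustration of the header that moved (nothing new in
this patch; the loop and the readiness check are made-up examples),
touch_nmi_watchdog() is meant for exactly this kind of code:

	#include <linux/nmi_watchdog.h>
	#include <linux/delay.h>

	/* Busy-wait with interrupts off without tripping the NMI
	   watchdog; my_device_ready() is a hypothetical check. */
	static void my_long_poll(void)
	{
		while (!my_device_ready()) {
			touch_nmi_watchdog(); /* reset the watchdog timeout */
			udelay(100);
		}
	}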